CN109934227A

CN109934227A - System and method for recognizing text in images

Info

Publication number: CN109934227A
Application number: CN201910191038.4A
Authority: CN
Inventors: 周钊; 郑莹斌; 叶浩
Original assignee: Shanghai Chengguan Information Technology Co Ltd
Current assignee: Shanghai Chengguan Information Technology Co Ltd
Priority date: 2019-03-12
Filing date: 2019-03-12
Publication date: 2019-06-25

Abstract

An image text recognition system comprising a detection model and/or a recognition model. The detection model detects the regions in which text is located in an image; the recognition model extracts the text information from the image. An operation module calls the detection model and/or the recognition model to detect and recognize text in the image, obtaining preliminary annotation results. A sample selection module selects some or all of the samples in the preliminary annotation results. An annotation correction module corrects the annotation results selected by the sample selection module to obtain refined annotation results, which are used to continue training and optimizing the detection model and/or the recognition model.

Description

System and method for recognizing text in images

Technical field

The invention belongs to the technical field of image processing, and in particular relates to a system and method for recognizing text in images.

Background

Intelligent image text understanding refers to the use of artificial intelligence methods and models to process and analyze an image and to determine the specific position and content of the text in it. Identifying the position and content of the text helps in understanding the image as a whole, and provides a basis for extracting structured image content and judging key information, which helps solve practical problems such as sensitive-text identification and intelligent OCR document recognition.

Intelligent image text understanding may comprise two steps:

1. extracting the regions of the image that contain text, a process also called image text detection;

2. extracting the text content from an input image region or from the whole image, a process also called image text recognition.

Either of the two steps may be used on its own, or the two may be combined (image text detection first, then image text recognition).

Existing general-purpose methods for intelligent image text understanding usually train an image text detection and/or image text recognition model on a dataset whose text has already been annotated. These methods rely mainly on deep neural network algorithms, which generally need a large number of data samples for model training. For image text detection and recognition, this means that text positions or text content must be annotated on a massive number of images containing text.

Existing general-purpose methods often annotate all of the data directly and then train the model on the annotated content. Because an image with text often contains one or more text regions, the annotation workload is large, and because much of the annotated text may be repetitive, efficiency is low.

Summary of the invention

Embodiments of the invention provide a system and method for recognizing text in images. Using an intelligent hybrid annotation method, they address the problems that, in current intelligent text understanding, text detection and text recognition models trained on only partially annotated samples have low accuracy, new training data is difficult to annotate, and manual annotation is costly.

In one embodiment of the invention, an image text recognition method performs intelligent image text understanding based on human-machine hybrid annotation by the following steps:

calling the basic text detection and recognition models to obtain the text regions and text content in the image, and taking this result as the preliminary annotation result;

selecting some or all of the samples with an automatic sample selection algorithm and then correcting their annotations manually, where manual correction may include adjusting detection results that contain errors, correcting recognized content, filling in missing annotations, and deleting erroneous results;

taking the refined annotation results and using them to further train the detection model and/or the recognition model, improving model accuracy;

repeating the above steps until a result that meets the requirements is obtained.
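The four steps above can be sketched as a simple loop. This is a minimal illustration rather than the patented implementation: the function name `hybrid_annotation_loop`, its callable parameters and the `max_rounds` safety limit are all hypothetical and do not appear in the original text.

```python
from typing import Any, Callable, List


def hybrid_annotation_loop(
    images: List[Any],
    pre_annotate: Callable[[Any], Any],        # basic models produce a preliminary annotation
    select: Callable[[List[Any]], List[Any]],  # automatic sample selection
    correct: Callable[[Any], Any],             # manual (or programmed) correction
    retrain: Callable[[List[Any]], None],      # further training on refined annotations
    meets_requirements: Callable[[], bool],    # stopping test
    max_rounds: int = 10,                      # assumed safety limit, not in the source
) -> int:
    """Run pre-annotation, selection, correction and retraining until done.

    Returns the number of rounds that were executed.
    """
    rounds = 0
    while rounds < max_rounds:
        preliminary = [pre_annotate(image) for image in images]
        chosen = select(preliminary)
        refined = [correct(annotation) for annotation in chosen]
        retrain(refined)
        rounds += 1
        if meets_requirements():
            break
    return rounds
```

Passing each stage in as a callable keeps the loop independent of any particular detection or recognition model, which matches the "and/or" flexibility described in the text.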

In detecting and recognizing image text, the method of the embodiments can use text content of which only part has been annotated, reducing the data annotation and construction cost of an intelligent image text understanding system. After one or more rounds of the above human-machine hybrid annotation and model training, the image text recognition system obtains better text understanding results. The manual annotation may also be completed automatically by a programmed system.

The invention can annotate new training data automatically or by human-machine hybrid annotation in order to optimize the text detection model and the text recognition model, iteratively improving model accuracy. In particular, when new data is annotated, preliminary annotation can be performed in advance by an operation module composed of the existing text detection and/or text recognition models. This reduces repeated annotation of similar image data, greatly reduces the manual annotation workload required, and allows an intelligent image text understanding model that meets the requirements to be built quickly.

Brief description of the drawings

The above and other objects, features and advantages of exemplary embodiments of the invention will become easier to understand on reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the invention are shown by way of example and not limitation, in which:

Fig. 1 is a flow diagram of a method for recognizing text contained in an image according to one embodiment of the invention.

Fig. 2 is a flow diagram of a method for recognizing text contained in an image according to one embodiment of the invention.

Fig. 3 is a flow diagram of a method for recognizing text contained in an image according to one embodiment of the invention.

Fig. 4 is a diagram of a system for recognizing text contained in an image according to one embodiment of the invention.

Fig. 5 is a diagram of a system for recognizing text contained in an image according to one embodiment of the invention.

Fig. 6 is a diagram of a system for recognizing text contained in an image according to one embodiment of the invention.

Detailed description

According to one or more embodiments, as shown in Fig. 1, an image text recognition method for recognizing the text information contained in an image comprises the following steps:

S101, train a basic text detection model and a basic text recognition model on an existing or already annotated image text dataset. The text detection model may be any kind of image text detection model, including but not limited to: models for standard rectangular box annotation (such as the Connectionist Text Proposal Network), models for detecting text rotated by some angle (such as RRPN, EAST, DMP-Net), and models for detecting text of arbitrary shape (such as Total-Text, TextSnake, etc.). The text recognition model may be any kind of image text recognition model, including but not limited to Attention-based deep network models and CTC-based deep network models. If the preliminary models are trained on an existing image dataset whose text content has been annotated, image text understanding can be achieved using only the annotated portion of the text content, reducing the data annotation cost of the image text system. The annotated text content may come from text detection or recognition datasets published by the academic community, or from text regions or content annotated by hand by the implementing team;

S102, combine the text detection model and the text recognition model into an operation module. For an input image, the operation module first calls the text detection model to compute the regions where text is located; each such region, after a geometric transformation (for example, straightening a tilted or curved region), is then used as input to the text recognition model, yielding the position and content of each text region in the image;

S103, call the operation module to perform text detection and recognition on new image data, and pass the detection and recognition results to the subsequent steps as preliminary annotation results;

S104, call the automatic sample selection algorithm to select all or part of the samples in the preliminary annotation results. Automatic sample selection algorithms include but are not limited to selection based on a sample confidence threshold and selection based on the distribution of sample confidence;

S105, manually correct the detection results and the recognition results respectively, revising and completing all text boxes in the image and correcting the text content corresponding to each text box, to obtain refined annotation results;

S106, further train the detection model and the recognition model on the refined annotation data, while tuning the text detection model and the text recognition model; model training may use the models of step S101 or other models that support incremental learning;

S107, integrate the optimized detection and recognition models into the operation module in place of the basic models, building a better-optimized image text understanding and recognition system;

S108, return to step S103 for the next round of iterative optimization, until text detection and text recognition models that meet the requirements are obtained. The criteria for meeting the requirements include but are not limited to one or more of the following: accuracy on the test or validation set exceeds a preset value; recall on the test or validation set exceeds a preset value; the number of model parameters is below a preset value; the model's running time on the test or validation set is below a preset value; and so on.
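The stopping test of step S108 can be expressed as a set of optional thresholds. The sketch below is illustrative only: the class names `Requirements` and `Metrics`, their field names, and the choice to require every configured criterion to hold are assumptions, since the text says only that one or more of the criteria may be used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirements:
    """Preset values; a field left as None means that criterion is not used."""
    min_accuracy: Optional[float] = None
    min_recall: Optional[float] = None
    max_parameters: Optional[int] = None
    max_runtime_seconds: Optional[float] = None


@dataclass
class Metrics:
    """Measurements taken on the test or validation set."""
    accuracy: float
    recall: float
    parameters: int
    runtime_seconds: float


def meets_requirements(m: Metrics, r: Requirements) -> bool:
    """Return True when every configured criterion is satisfied."""
    checks = [
        r.min_accuracy is None or m.accuracy > r.min_accuracy,
        r.min_recall is None or m.recall > r.min_recall,
        r.max_parameters is None or m.parameters < r.max_parameters,
        r.max_runtime_seconds is None or m.runtime_seconds < r.max_runtime_seconds,
    ]
    return all(checks)
```

Leaving a field unset disables that criterion, so the same function covers accuracy-only, recall-only or combined stopping rules.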

This embodiment provides a complete image text detection and recognition method that can be applied to a variety of OCR-related application scenarios.
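The operation module of step S102 chains detection, a geometric transformation and recognition. The following is a minimal sketch under stated assumptions: the class name `OperationModule`, the dictionary keys, and the detector and recognizer signatures are hypothetical, images are plain lists of pixel rows, and an axis-aligned crop stands in for the straightening of tilted or curved regions, which a real system would need to implement.

```python
from typing import Any, Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive
Image = List[List[int]]          # rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Stand-in geometric transformation: cut out an axis-aligned region."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


class OperationModule:
    """Calls the detector first, then the recognizer on each detected region."""

    def __init__(
        self,
        detector: Callable[[Image], List[Tuple[Box, float]]],  # boxes with confidence
        recognizer: Callable[[Image], Tuple[str, float]],      # text with confidence
        rectify: Callable[[Image, Box], Image] = crop,
    ) -> None:
        self.detector = detector
        self.recognizer = recognizer
        self.rectify = rectify

    def annotate(self, image: Image) -> List[Dict[str, Any]]:
        """Return one preliminary annotation per detected text region."""
        annotations = []
        for box, det_conf in self.detector(image):
            patch = self.rectify(image, box)
            text, rec_conf = self.recognizer(patch)
            annotations.append(
                {"box": box, "text": text, "det_conf": det_conf, "rec_conf": rec_conf}
            )
        return annotations
```

Because the models are injected, replacing the basic models with optimized ones (step S107) amounts to constructing a new `OperationModule` with the retrained callables.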

According to one or more embodiments, in some application scenarios only the text regions in an image need to be detected and text recognition itself is not required, for example when judging whether copy or subtitles have been inserted into an image. This embodiment optimizes only the image text detection model; its flow chart is shown in Fig. 2. The steps comprise:

S201, train a basic text detection model on an existing or already annotated image text dataset, and use it as the operation module;

S202, call the operation module to perform text detection on newly input image data, and use the detection results as preliminary annotation results for subsequent processing;

S203, call the automatic sample selection algorithm to select all or part of the samples in the preliminary annotation results;

S204, manually correct the preliminary detection results, including adjusting text boxes that deviate from the text, deleting annotation boxes that do not cover text regions, filling in missing annotation boxes, and so on, to obtain refined annotation results;

S205, further train the detection model on the refined annotation data;

S206, integrate the optimized detection model into the operation module;

S207, return to step S202 for the next round of iterative optimization, until a text detection model that meets the requirements is obtained.
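The three kinds of manual correction listed in step S204 (adjusting, deleting and adding boxes) can be recorded as edits against the preliminary box list. This is a hedged sketch: the function `apply_corrections` and its parameter names are hypothetical, and boxes are assumed to be `(left, top, right, bottom)` tuples.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def apply_corrections(
    boxes: List[Box],
    adjust: Optional[Dict[int, Box]] = None,  # index -> corrected box
    delete: Optional[Iterable[int]] = None,   # indices of boxes that are not text
    add: Optional[List[Box]] = None,          # boxes the detector missed
) -> List[Box]:
    """Turn preliminary detection boxes into refined annotation boxes."""
    adjust = adjust or {}
    deleted = set(delete or ())
    refined = [
        adjust.get(index, box)
        for index, box in enumerate(boxes)
        if index not in deleted
    ]
    refined.extend(add or [])
    return refined
```

For example, `apply_corrections(boxes, adjust={0: (1, 1, 11, 11)}, delete=[1], add=[(70, 70, 80, 80)])` shifts the first box, drops the second and appends one that was missed.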

This embodiment optimizes the image text detection method, and its effect is more pronounced in certain applications. One example is judging whether copy or subtitles have been inserted into an image. Another is an application for which, for lack of data samples, no suitable or usable text recognition model exists (such as recognition of ancient scripts or foreign-language text): the computer detects the text regions automatically and then hands them to domain experts for manual text recognition.

According to one or more embodiments, as shown in Fig. 3, an image text recognition method comprises the following steps:

S301, train a basic text detection model and a basic text recognition model on an existing or already annotated image text dataset, and combine the detection and recognition models into an operation module;

S302, call the operation module to perform text detection and recognition on new image data, and use the recognition results as preliminary annotation results and as input to the subsequent steps;

S303, call the automatic sample selection algorithm to select all or part of the samples in the preliminary annotation results;

S304, manually correct the preliminary recognition results to obtain refined annotation results;

S305, further train the recognition model on the refined annotation data;

S306, integrate the optimized recognition model into the operation module;

S307, return to step S302 for the next round of iterative optimization, until a text recognition model that meets the requirements is obtained.

This embodiment further optimizes the text recognition model, iterating only on the image text recognition model. It targets application scenarios such as bills with a fixed format, where the positions of the text regions can be obtained directly by template matching, so that only the recognition model needs to be updated.
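For a fixed-format bill, the text regions can be read from a template instead of being detected. The sketch below is a simplification of template matching: it assumes the bill image is already aligned with the template, so no search or registration is performed. The function names, the fractional-coordinate template format and the recognizer signature are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]                       # pixel (left, top, right, bottom)
FractionalBox = Tuple[float, float, float, float]     # same order, as fractions of image size
Image = List[List[int]]                               # rows of pixel values


def regions_from_template(
    template: Dict[str, FractionalBox], width: int, height: int
) -> Dict[str, Box]:
    """Scale each template field to pixel coordinates for an image of this size."""
    regions = {}
    for name, (x0, y0, x1, y1) in template.items():
        regions[name] = (
            round(x0 * width),
            round(y0 * height),
            round(x1 * width),
            round(y1 * height),
        )
    return regions


def recognize_fields(
    image: Image,
    template: Dict[str, FractionalBox],
    recognizer: Callable[[Image], str],
) -> Dict[str, str]:
    """Run only the recognition model on each templated field of the image."""
    height, width = len(image), len(image[0])
    fields = {}
    for name, (left, top, right, bottom) in regions_from_template(
        template, width, height
    ).items():
        patch = [row[left:right] for row in image[top:bottom]]
        fields[name] = recognizer(patch)
    return fields
```

Since no detection model is involved, only the `recognizer` callable changes between iterations, which is the point of this embodiment.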

According to one or more embodiments, as shown in Fig. 4, an image text recognition system comprises a detection model and/or a recognition model. The detection model detects the regions in which text is located in an image; the recognition model extracts the text information from the image. The system further comprises:

an operation module, which calls the detection model and/or the recognition model to detect and recognize text in the image, obtaining preliminary annotation results;

a sample selection module, which selects some or all of the samples in the preliminary annotation results;

an annotation correction module, which corrects the annotation results selected by the sample selection module to obtain refined annotation results, the refined annotation results being used to continue training and optimizing the detection model and/or the recognition model.
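The sample selection module can be illustrated with the two families of selection algorithm named for the method: selection by a confidence threshold and selection by the distribution of confidence. The sketch below is an assumption-laden example: the text does not say which samples are preferred, so it assumes low-confidence samples are the ones sent for correction, and the function names and the `"confidence"` key are hypothetical.

```python
import math
from typing import Any, Dict, List

Sample = Dict[str, Any]  # must contain a float under the key "confidence"


def select_by_threshold(samples: List[Sample], threshold: float) -> List[Sample]:
    """Keep samples whose confidence is below a fixed threshold."""
    return [s for s in samples if s["confidence"] < threshold]


def select_by_distribution(samples: List[Sample], fraction: float) -> List[Sample]:
    """Keep the lowest-confidence fraction of samples, at least one if any exist."""
    if not samples:
        return []
    count = max(1, math.ceil(fraction * len(samples)))
    return sorted(samples, key=lambda s: s["confidence"])[:count]
```

The threshold variant gives a workload that depends on model quality, while the distribution variant fixes the share of samples reviewed in each round; passing a fraction of 1.0 selects all samples.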

This embodiment provides a complete image text detection and recognition system that can be applied to a variety of OCR-related application scenarios.

According to one or more embodiments, as shown in Fig. 5, an image text recognition system comprises: a detection model, which detects the regions in which text is located in an image; an operation module, which calls the detection model to detect text in the image, obtaining preliminary annotation results; a sample selection module, which selects some or all of the samples in the preliminary annotation results; and an annotation correction module, which corrects the annotation results selected by the sample selection module to obtain refined annotation results, the refined annotation results being used to continue training and optimizing the detection model.

The image text recognition system of this embodiment optimizes the image text detection model, and its effect is more pronounced in certain applications. One example is judging whether copy or subtitles have been inserted into an image. Another is an application for which, for lack of data samples, no suitable or usable text recognition model exists (such as recognition of ancient scripts or foreign-language text): the computer detects the text regions automatically and then hands them to domain experts for manual text recognition.

According to one or more embodiments, as shown in Fig. 6, an image text recognition system comprises a detection model and/or a recognition model. The detection model detects the regions in which text is located in an image; the recognition model extracts the text information from the image. The system further comprises:

an annotation correction module, which corrects the annotation results selected by the sample selection module to obtain refined annotation results, the refined annotation results being used to continue training and optimizing the recognition model.

The image text recognition system of this embodiment further optimizes the text recognition model, iterating only on the image text recognition model. It targets application scenarios such as bills with a fixed format, where the positions of the text regions can be obtained directly by template matching, so that only the recognition model needs to be updated.

According to one or more embodiments, a system implements a human-machine hybrid annotation method for intelligent image text understanding, where intelligent text understanding comprises two main functions: image text region detection and recognition. The system uses an initially trained basic text detection model and basic text recognition model to perform preliminary text detection and/or recognition on the images to be annotated. Based on the preliminary output, the system uses an automatic selection algorithm to select some or all of the samples, whose detection and/or recognition results are corrected by manual proofreading, giving the images a further, refined annotation. Finally, the refined annotation results are used to train the basic text detection and/or recognition models, improving the accuracy of the detection and recognition models. After one or more rounds of the above human-machine hybrid annotation and model training, the system obtains better text understanding results.

According to one or more embodiments, an image text recognition network platform comprises a server, the server having a memory; and

a processor coupled to the memory, the processor being configured to execute instructions stored in the memory, the processor performing the following operations:

training a text detection model and a text recognition model on an existing or already annotated image text dataset;

combining the text detection model and the text recognition model into an operation module;

calling the operation module to perform text detection and recognition on image data, and taking the detection and recognition results as preliminary annotation results;

calling the automatic sample selection algorithm to select all or part of the samples in the preliminary annotation results;

correcting the detection results and the recognition results respectively, revising and completing all text boxes in the image and correcting the text content corresponding to each text box, to obtain refined annotation results;

further training the text detection model and the text recognition model on the refined annotation data, while tuning the text detection model and the text recognition model;

replacing the original text detection model and text recognition model with the optimized text detection model and text recognition model, building a better-optimized image text understanding and recognition system.

According to one or more embodiments, an image text recognition server has a memory and a processor coupled to the memory, the processor being configured to execute instructions stored in the memory, the processor performing the following operations:

combining a text detection model and a text recognition model into an operation module;

It should be understood that, in the embodiments of the invention, the term "and/or" merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the invention.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.

Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the invention.

In addition, the functional units in the various embodiments of the invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

The above is merely a specific embodiment of the invention, but the scope of protection of the invention is not limited to it. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the invention, and these modifications or replacements shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.

Claims

1. An image text recognition system, the system comprising a detection model and/or a recognition model,

the detection model being used to detect the regions in which text is located in an image;

the recognition model being used to extract the text information from the image; and

an annotation correction module, which corrects the annotation results selected by the sample selection module to obtain refined annotation results, the refined annotation results being used to continue training and optimizing the detection model and/or the recognition model.

2. An image text recognition system, the system comprising:

a detection model, used to detect the regions in which text is located in an image;

an operation module, which calls the detection model to detect text in the image, obtaining preliminary annotation results;

an annotation correction module, which corrects the annotation results selected by the sample selection module to obtain refined annotation results, the refined annotation results being used to continue training and optimizing the detection model.

3. An image text recognition system, the system comprising a detection model and/or a recognition model,

the detection model being used to detect the regions in which text is located in an image;

the recognition model being used to extract the text information from the image; and

an annotation correction module, which corrects the annotation results selected by the sample selection module to obtain refined annotation results, the refined annotation results being used to continue training and optimizing the recognition model.

4. An image text recognition method, the method comprising the following steps:

training a text detection model and a text recognition model on an existing or already annotated image text dataset, and combining the text detection model and the text recognition model into an operation module;

calling the operation module to perform text detection and recognition on image data, and taking the detection and recognition results as preliminary annotation results;

correcting the detection results and the recognition results respectively, revising and completing all text boxes in the image and correcting the text content corresponding to each text box, to obtain refined annotation results;

further training the text detection model and the text recognition model on the refined annotation data, while tuning the text detection model and the text recognition model;

replacing the original text detection model and text recognition model with the optimized text detection model and text recognition model, building a better-optimized image text understanding and recognition system.

5. An image text recognition method, the method comprising the following steps:

training a text detection model on an existing or already annotated image text dataset, and using it as the operation module;

calling the operation module to perform text detection on image data, and taking the detection results as preliminary annotation results;

correcting the preliminary detection results to obtain refined annotation results;

further training the detection model on the refined annotation data;

integrating the optimized detection model into the operation module.

6. An image text recognition method, the method comprising the following steps:

training a text detection model and a text recognition model on an existing or already annotated image text dataset, and combining the detection and recognition models into an operation module;

calling the operation module to perform text detection and recognition on image data, and taking the recognition results as preliminary annotation results;

correcting the preliminary recognition results to obtain refined annotation results;

further training the recognition model on the refined annotation data;

integrating the optimized recognition model into the operation module.

7. An image text recognition network platform, characterized in that the network platform comprises a server, the server having a memory; and

a processor coupled to the memory, the processor being configured to execute instructions stored in the memory, the processor performing the following operations:

combining a text detection model and a text recognition model into an operation module;

8. An image text recognition server, the server having a memory; and

9. A storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of any one of claims 4 to 6.