CN109447015A

CN109447015A - Method and device for processing box-selected text in a table image

Info

Publication number: CN109447015A
Application number: CN201811317237.7A
Authority: CN
Inventors: 李鹏辉; 竺晨曦; 邱锡鹏
Original assignee: Shanghai Rhinoceros Technology Co Ltd
Current assignee: Shanghai Rhinoceros Technology Co Ltd
Priority date: 2018-11-03
Filing date: 2018-11-03
Publication date: 2019-03-08

Abstract

The present invention provides a method for processing box-selected text in a table image, comprising: removing items in the table image that interfere with recognition; using a jointly trained model to box-select text strips in the table image, obtaining the position coordinates of each text strip in the table image, and recognizing the text content corresponding to each text strip; and restoring the table from the text strips, their position coordinates, and the text content by using table features. A device implementing the method comprises: a preprocessing module for removing the interfering items from the table image; a recognition module that uses the jointly trained model to box-select text strips in the table image, obtain the position coordinates of each text strip, and recognize the corresponding text content; and a table restoration module that restores the table from the text strips, their position coordinates, and the text content by using table features. The present invention can improve the accuracy of text recognition and of table restoration for table images.

Description

Method and device for processing box-selected text in a table image

Technical field

The present invention relates to a table processing method, and in particular to a method and device for processing box-selected text in a table image.

Background art

In the field of OCR, recognition accuracy is already high for large blocks of text such as a typical A4 page. For tables, however, the accuracy currently achieved in the industry is not very high, because the original recognition approach based on character segmentation makes layout restoration difficult and cannot make use of the information contained in the table.

Summary of the invention

To address the above shortcomings, the present invention provides a method and device for processing box-selected text in a table image that can improve both text recognition accuracy and restoration accuracy for table images.

To achieve the above object, the present invention provides a method for processing box-selected text in a table image, comprising the following steps:

Step 1: remove the items in the table image that interfere with recognition;

Step 2: using a jointly trained model, box-select text strips in the table image, obtain the position coordinates of each text strip in the table image, and recognize the text content corresponding to each text strip;

Step 3: restore the table from the text strips, their position coordinates, and the text content by using table features.

In the above method, in step 1 the table image is preprocessed to remove the items that interfere with recognition; the preprocessing includes an image angle correction operation or a watermark and seal removal operation.
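The angle correction mentioned above can be illustrated with a minimal sketch. The text does not say how the correction is computed, so the sketch assumes that two points on a table ruling line that should be horizontal are already known; the function names are hypothetical. It estimates the skew angle from those two points and rotates coordinates back by that angle.

```python
import math

def estimate_skew_degrees(left, right):
    """Angle, in degrees, between the segment left->right and the x axis."""
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(point, center, degrees):
    """Rotate `point` about `center` by `degrees` in the coordinate plane."""
    theta = math.radians(degrees)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(theta) - y * math.sin(theta),
            center[1] + x * math.sin(theta) + y * math.cos(theta))

def deskew_points(points, left, right):
    """Rotate `points` about `left` so that the segment left->right becomes horizontal."""
    angle = estimate_skew_degrees(left, right)
    return [rotate_point(p, left, -angle) for p in points]
```

A real pipeline would apply the same rotation to every pixel of the image (for example with an image library) rather than to a handful of points; the sketch only shows the geometry.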

In the above method, step 2 includes the following sub-steps:

Step 21: perform RGB three-channel processing on the table image from which the interfering items have been removed, so as to form at least two table image layers;

Step 22: extract features from each table image layer by convolution;

Step 23: in the first table image layer, predict the position coordinates of each text strip within the first table image layer;

Step 24: in the second table image layer, obtain the text content corresponding to each text strip from the image information and by applying a language model.
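The channel processing and convolutional feature extraction of steps 21 and 22 can be sketched in plain Python. This is only an illustration under assumptions: the text does not specify how the layers are formed, so the sketch simply splits an image into its R, G and B layers, and it uses a single fixed kernel where a trained network would learn many. The function names are hypothetical.

```python
def split_channels(image):
    """Split an H x W grid of (r, g, b) pixels into three H x W single-channel layers."""
    return tuple([[pixel[c] for pixel in row] for row in image] for c in range(3))

def convolve2d(layer, kernel):
    """Slide `kernel` over `layer` without padding (the cross-correlation used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(layer) - kh + 1
    out_w = len(layer[0]) - kw + 1
    return [[sum(layer[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# Example: a 3 x 3 image whose pixels are all (1, 2, 3), and a 2 x 2 summing kernel.
image = [[(1, 2, 3)] * 3 for _ in range(3)]
red, green, blue = split_channels(image)
red_features = convolve2d(red, [[1, 1], [1, 1]])
```

In a deep learning framework the same operation would be a convolution layer applied to a tensor; the nested lists here only make the arithmetic visible.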

In the above method, in step 23 the position coordinates of a text strip include the top-left coordinate (x0, y0), the top-right coordinate (x1, y1), the bottom-right coordinate (x2, y2), and the bottom-left coordinate (x3, y3).
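One way to hold these four corner coordinates together with the recognized text is a small record type. The `TextStrip` class and its helper methods below are hypothetical and are not named in the text; only the corner order (x0, y0) to (x3, y3) is taken from it.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass(frozen=True)
class TextStrip:
    top_left: Point      # (x0, y0)
    top_right: Point     # (x1, y1)
    bottom_right: Point  # (x2, y2)
    bottom_left: Point   # (x3, y3)
    text: str = ""

    def corners(self) -> List[Point]:
        """Corners in the order top-left, top-right, bottom-right, bottom-left."""
        return [self.top_left, self.top_right, self.bottom_right, self.bottom_left]

    def bounds(self) -> Tuple[float, float, float, float]:
        """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
        xs = [p[0] for p in self.corners()]
        ys = [p[1] for p in self.corners()]
        return (min(xs), min(ys), max(xs), max(ys))

    def center(self) -> Point:
        """Mean of the four corners."""
        pts = self.corners()
        return (sum(p[0] for p in pts) / 4, sum(p[1] for p in pts) / 4)

strip = TextStrip((0, 0), (10, 1), (10, 5), (0, 4), "abc")
```

Storing four corners rather than a width and height lets the record describe a slightly rotated or skewed quadrilateral, which an axis-aligned rectangle cannot.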

In the above method, in step 3 the table is cut into rows and columns according to the position coordinates of the text strips, the text content is placed at the corresponding text strip positions, and cells are merged based on semantic judgment, thereby completing the restoration of the whole table.
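The row and column cutting in step 3 can be sketched as a clustering of box centers. This is an assumed approach, not one stated in the text: boxes whose vertical centers lie within a tolerance of each other are treated as one row, and likewise for columns. The function names and the `tolerance` parameter are hypothetical, and the semantic merging of cells is not covered.

```python
def cluster_1d(values, tolerance):
    """Group sorted values whose neighbours differ by at most `tolerance`; return group means."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= tolerance:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]

def nearest_index(value, centers):
    """Index of the cluster center closest to `value`."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))

def restore_grid(strips, tolerance=10.0):
    """Place strips, given as (text, x_min, y_min, x_max, y_max), into a row-by-column grid."""
    xs = [(s[1] + s[3]) / 2 for s in strips]
    ys = [(s[2] + s[4]) / 2 for s in strips]
    col_centers = cluster_1d(xs, tolerance)
    row_centers = cluster_1d(ys, tolerance)
    grid = [["" for _ in col_centers] for _ in row_centers]
    for s, x, y in zip(strips, xs, ys):
        grid[nearest_index(y, row_centers)][nearest_index(x, col_centers)] = s[0]
    return grid

strips = [("Name", 0, 0, 40, 20), ("Age", 100, 2, 140, 22),
          ("Li", 0, 40, 40, 60), ("30", 100, 41, 140, 61)]
grid = restore_grid(strips)
```

Cells that receive no text stay empty, which is where a later merging pass could decide whether a neighbouring cell spans them.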

The above method further includes step 4: presenting the restored table.

The present invention also provides a device for processing box-selected text in a table image, comprising: a preprocessing module, a recognition module, and a table restoration module;

the preprocessing module removes the items in the table image that interfere with recognition;

the recognition module uses a jointly trained model to box-select text strips in the table image, obtain the position coordinates of each text strip in the table image, and recognize the text content corresponding to each text strip;

the table restoration module restores the table from the text strips, their position coordinates, and the text content by using table features.

In the above device, the preprocessing that the preprocessing module performs on the table image includes an image angle correction operation or a watermark and seal removal operation.

In the above device, the recognition module operates as follows:

RGB three-channel processing is performed on the table image to form at least two table image layers;

features are extracted from each table image layer by convolution;

in the first table image layer, the position coordinates of each text strip within the first table image layer are predicted;

in the second table image layer, the text content corresponding to each text strip is obtained from the image information and by applying a language model.

In the above device, the table restoration module cuts the table into rows and columns according to the position coordinates of the text strips, places the text content at the corresponding text strip positions, and merges cells based on semantic judgment, thereby completing the restoration of the whole table.

Compared with the prior art, the present invention has the following advantages:

A deep learning model is trained jointly on table-based text box selection and text recognition, so that the two tasks share the image information of the table. This makes the final table text recognition more accurate, preserves the layout information of the table itself, and improves the accuracy of table layout restoration.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a structural block diagram of the device of the present invention.

The main reference numerals are as follows:

1 - preprocessing module; 2 - recognition module; 3 - table restoration module; 4 - presentation module

Detailed description of the embodiments

As shown in Fig. 1, the present invention provides a method for processing box-selected text in a table image, comprising the following steps:

Step 1: remove the items in the table image that interfere with recognition.

In step 1, the table image is preprocessed to remove the items that interfere with recognition; the preprocessing includes an image angle correction operation or a watermark and seal removal operation.

Step 2: using a jointly trained model, box-select text strips in the table image, obtain the position coordinates of each text strip in the table image, and recognize the text content corresponding to each text strip.

Step 2 includes the following sub-steps:

The position coordinates of a text strip include the top-left coordinate (x0, y0), the top-right coordinate (x1, y1), the bottom-right coordinate (x2, y2), and the bottom-left coordinate (x3, y3).

In step 3, the table is cut into rows and columns according to the position coordinates of the text strips, the text content is placed at the corresponding text strip positions, and cells are merged based on semantic judgment, thereby completing the restoration of the whole table.

Step 4: present the restored table.
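Step 4 does not specify an output format. As a minimal sketch under that caveat, a restored grid of cell strings can be rendered as an aligned plain-text table; the `render_table` function is hypothetical.

```python
def render_table(grid):
    """Render a list of equal-length rows of strings as an aligned plain-text table."""
    if not grid:
        return ""
    widths = [max(len(row[c]) for row in grid) for c in range(len(grid[0]))]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"
    lines = [border]
    for row in grid:
        cells = (" " + cell.ljust(w) + " " for cell, w in zip(row, widths))
        lines.append("|" + "|".join(cells) + "|")
        lines.append(border)
    return "\n".join(lines)

rendered = render_table([["Name", "Age"], ["Li", "30"]])
```

Column widths here count characters, so full-width characters such as Chinese text would need a display-width calculation to stay aligned in a terminal.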

The jointly trained model is trained as follows:

1. Generate tables for different fonts and different table types, attaching the corresponding text strip and text information;

2. Add noise to the generated tables to ensure the robustness of the model;

3. Feed the samples into the joint training model for training;

4. Use the trained model for OCR recognition.
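The first two training steps can be sketched as a synthetic label generator. This is an assumption-heavy illustration: the text does not describe the generator or the kind of noise, so the sketch produces only label records (no rendered image) on a regular grid and stands in for image noise with a random shift of each box. All names and parameters are hypothetical.

```python
import random

def generate_table_sample(rows, cols, vocabulary, fonts, cell_w=100, cell_h=30, rng=None):
    """Create one label record per cell: text, font name, and four corner coordinates."""
    rng = rng or random.Random()
    labels = []
    for r in range(rows):
        for c in range(cols):
            x0, y0 = c * cell_w, r * cell_h
            labels.append({
                "text": rng.choice(vocabulary),
                "font": rng.choice(fonts),
                # Order: top-left, top-right, bottom-right, bottom-left.
                "corners": [(x0, y0), (x0 + cell_w, y0),
                            (x0 + cell_w, y0 + cell_h), (x0, y0 + cell_h)],
            })
    return labels

def add_position_noise(labels, max_shift, rng=None):
    """Shift every box by a random offset of at most `max_shift` in x and y."""
    rng = rng or random.Random()
    noisy = []
    for label in labels:
        dx = rng.uniform(-max_shift, max_shift)
        dy = rng.uniform(-max_shift, max_shift)
        shifted = [(x + dx, y + dy) for x, y in label["corners"]]
        noisy.append({**label, "corners": shifted})
    return noisy

rng = random.Random(0)
labels = generate_table_sample(2, 3, ["a", "b"], ["font-1", "font-2"], rng=rng)
noisy = add_position_noise(labels, 2.0, rng=rng)
```

Passing a seeded `random.Random` keeps a generated dataset reproducible, which helps when comparing training runs.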

As shown in Fig. 2, the present invention provides a device for processing box-selected text in a table image, comprising: a preprocessing module 1, a recognition module 2, and a table restoration module 3.

The preprocessing module 1 removes the items in the table image that interfere with recognition.

The preprocessing module preprocesses the table image to remove the items that interfere with recognition; the preprocessing includes an image angle correction operation or a watermark and seal removal operation.

The recognition module 2 uses a jointly trained model to box-select text strips in the table image, obtain the position coordinates of each text strip in the table image, and recognize the text content corresponding to each text strip.

The recognition module operates as follows:

RGB three-channel processing is performed on the table image from which the interfering items have been removed, so as to form at least two table image layers;

The table restoration module 3 restores the table from the text strips, their position coordinates, and the text content by using table features.

The table restoration module cuts the table into rows and columns according to the position coordinates of the text strips, places the text content at the corresponding text strip positions, and merges cells based on semantic judgment, thereby completing the restoration of the whole table.

The device further includes a presentation module 4 for presenting the restored table.

The jointly trained model as a whole is based on the deep learning CTPN model. As an inventive addition, CTC is attached to the CTPN model to recognize the features around each text strip, so that the recognition process can make use of table features and accuracy is substantially improved.
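The text names CTC but does not describe how its output is decoded. A common choice, shown here only as an assumed illustration, is greedy CTC decoding: take the highest-scoring label in each frame, collapse consecutive repeats, and drop the blank label. The function name and the convention that index 0 is the blank are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding: per-frame argmax, collapse repeats, remove blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    decoded = []
    previous = None
    for index in best:
        # Emit a label only when it differs from the previous frame and is not blank.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)

# Alphabet position 0 is a placeholder for the blank label.
alphabet = ["-", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.8, 0.1],  # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.2, 0.7, 0.1],  # a
    [0.1, 0.2, 0.7],  # b
    [0.1, 0.2, 0.7],  # b (repeat, collapsed)
]
decoded_text = ctc_greedy_decode(frames, alphabet)
```

The blank label is what allows a genuinely doubled character to survive the collapse step, as the two separate "a" outputs in the example show.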

The foregoing is merely a preferred embodiment of the present invention and is illustrative rather than restrictive. Those skilled in the art will understand that many changes, modifications, and even equivalents may be made within the spirit and scope defined by the claims, all of which fall within the protection scope of the present invention.

Claims

1. A method for processing box-selected text in a table image, comprising the following steps:

Step 1: remove the items in the table image that interfere with recognition;

Step 2: using a jointly trained model, box-select text strips in the table image, obtain the position coordinates of each text strip in the table image, and recognize the text content corresponding to each text strip;

2. The method for processing box-selected text in a table image according to claim 1, characterized in that in step 1 the table image is preprocessed to remove the items that interfere with recognition, and the preprocessing of the table image includes an image angle correction operation or a watermark and seal removal operation.

3. The method for processing box-selected text in a table image according to claim 1, characterized in that step 2 includes the following sub-steps:

Step 21: perform RGB three-channel processing on the table image from which the interfering items have been removed, so as to form at least two table image layers;

Step 24: in the second table image layer, obtain the text content corresponding to each text strip from the image information and by applying a language model.

4. The method for processing box-selected text in a table image according to claim 3, characterized in that in step 23 the position coordinates of a text strip include the top-left coordinate (x0, y0), the top-right coordinate (x1, y1), the bottom-right coordinate (x2, y2), and the bottom-left coordinate (x3, y3).

5. The method for processing box-selected text in a table image according to claim 1, characterized in that in step 3 the table is cut into rows and columns according to the position coordinates of the text strips, the text content is placed at the corresponding text strip positions, and cells are merged based on semantic judgment, thereby completing the restoration of the whole table.

6. The method for processing box-selected text in a table image according to claim 1, characterized by further including step 4: presenting the restored table.

7. A device for implementing the method for processing box-selected text in a table image according to claim 1, characterized by comprising: a preprocessing module, a recognition module, and a table restoration module;

the recognition module uses a jointly trained model to box-select text strips in the table image, obtain the position coordinates of each text strip in the table image, and recognize the text content corresponding to each text strip;

8. The device according to claim 7, characterized in that the preprocessing that the preprocessing module performs on the table image includes an image angle correction operation or a watermark and seal removal operation.

9. The device according to claim 7, characterized in that the recognition module operates as follows:

in the second table image layer, the text content corresponding to each text strip is obtained from the image information and by applying a language model.

10. The device according to claim 7, characterized in that the table restoration module cuts the table into rows and columns according to the position coordinates of the text strips, places the text content at the corresponding text strip positions, and merges cells based on semantic judgment, thereby completing the restoration of the whole table.