CN113836971A - Method, system and storage medium for reproducing visual information identified by image type scanning piece

Info

Publication number: CN113836971A
Application number: CN202010580263.XA
Authority: CN
Inventors: 翟晓刚
Original assignee: China Life Insurance Asset Management Co ltd
Current assignee: China Life Insurance Asset Management Co ltd
Priority date: 2020-06-23
Filing date: 2020-06-23
Publication date: 2021-12-24
Anticipated expiration: 2040-06-23
Also published as: CN113836971B

Abstract

The invention relates to the field of document processing and discloses a method, a system and a storage medium for reproducing visual information after recognition of an image type scanning piece. A user loads the visual information recovery system for the recognized content of image type scanning pieces and uploads the image type scanning piece PDF whose visual information is to be reproduced. A comparison table is established between character width pixels and line spacing pixels on the one hand and the fonts, font sizes and line spacing sizes of the font library in a Word document on the other. The character content in all line detection areas is recognized through a visual information analysis algorithm based on computer vision and an OCR character recognition technology based on deep learning; the number of characters in each line detection area is counted; and the average character width pixels and the line spacing pixels are calculated and compared against the comparison table to obtain the font, font size and line spacing. Paragraph visual information is then calculated to determine the paragraph heads, and finally an editable Word document is output according to the font, font size, line spacing and paragraph visual information. The information uploaded by the user is kept confidential, and the system is safe, easy to operate, and converts the PDF format quickly.

Description

Method, system and storage medium for reproducing visual information identified by image type scanning piece

Technical Field

The invention relates to the field of document processing, in particular to a method and a system for reproducing visual information identified by an image type scanning piece and a storage medium.

Background

PDF (Portable Document Format) is a common electronic file format. It offers high universality and compatibility across multiple types of operating systems and ensures that data are not modified or altered by encoding differences during file transmission, so PDF has become a mainstream format for transmitting file information. A PDF file prevents its contents from being modified by an accidental keystroke, but it is also inconvenient to modify and difficult to convert into other file formats.

Disclosure of Invention

The invention provides a method, a system and a storage medium for reproducing visual information identified by an image type scanning piece, aiming at the problems in the prior art.

In order to solve the technical problems, the invention provides the following technical scheme:

a method for reproducing visual information recognized by an image type scanning piece, comprising the following steps:

S1: establishing a comparison table between character width pixels and line spacing pixels and the fonts, font sizes and line spacing sizes of the font library in a Word document;

S2: uploading an image type scanning piece PDF;

S3: cutting the image type scanning piece PDF page by page into pictures and preprocessing the pictures;

S4: calculating the position information of the text line areas of all cut pictures of the image type scanning piece PDF through a text line detection algorithm based on deep learning and computer vision technology, namely calculating the character width pixels of each character in a line area and the start coordinate information and end coordinate information of all text line areas;

S5: recognizing the character content in all line detection areas through an OCR character recognition technology based on deep learning;

S6: counting the number of characters, including punctuation marks, in each line detection area;

S7: calculating the average character width pixels from the line width, which is obtained from the start coordinate information and end coordinate information of the text line area, and the number of characters in the line; comparing the average character width pixels with the character width pixels calculated in step S4; and taking the smaller value as the final character width pixel value;

S8: looking up the character width pixel value obtained in step S7 in the comparison table to obtain the corresponding font and font size, and matching the font and font size to the position information of the line detection area;

S9: calculating the line spacing pixels from the start coordinate information and end coordinate information of the text line areas and looking them up in the comparison table to obtain the corresponding line spacing size, and meanwhile calculating paragraph visual information to determine whether a line is a paragraph head;

S10: outputting an editable Word document according to the font, font size, line spacing and paragraph visual information.

Further, step S1 includes: establishing correspondences between all character width pixels and all line spacing pixels and the common fonts, font sizes and line spacing sizes in Word.

Further, step S2 includes: executing a localized encryption program when the image type scanning piece PDF is uploaded.

Further, the preprocessing in step S3 includes stamp removal, tilt correction, noise removal, and the like.

Further, step S4 includes: the image type scanning piece PDF is a long text; the pictures of the long-text image type scanning piece are analyzed and processed page by page to detect and locate the text line areas, and the start coordinate information and end coordinate information of each line area are analyzed and calculated.

Further, step S9 includes: according to the start coordinate information and end coordinate information of adjacent line areas, determining the line spacing pixels by calculating the line height difference between the end coordinate of the high-order line and the start coordinate of the low-order line, and determining the paragraph visual information by calculating the line width difference between the start coordinate of the high-order line and the start coordinate of the low-order line.

Further, step S9 determines the paragraph head as follows: if the line width difference between the start coordinates of adjacent lines is approximately 2 times the character width pixels calculated in step S7, the line is marked as a paragraph head, that is, the line begins with two blank character spaces.

The invention provides a method for reproducing visual information after image type scanning piece recognition, which searches the comparison table of fonts, font sizes and line spacings using the character width pixels of steps S4 to S8 to obtain the corresponding font size, calculates the paragraph head in step S9, and identifies a paragraph head by its blank space of 2 characters.

The invention provides a visual information reproduction system after image type scanning piece recognition, which is loaded on a local CPU server and used concurrently by multiple users. The system is a visual information recovery system for the recognized content of image type scanning pieces and comprises:

a memory for storing executable instructions;

and a processor for implementing the above visual information reproduction method after image type scanning piece recognition when running the executable instructions stored in the memory.

The invention provides a computer readable storage medium storing executable instructions which, when executed by a processor, implement the above visual information reproduction method after image type scanning piece recognition.

The invention provides a method, a system and a storage medium for reproducing visual information after image type scanning piece recognition, which allow multiple users to use the system simultaneously to upload image type scanning piece PDF files for visual information reproduction.

Drawings

FIG. 1 is a block diagram of an embodiment of the present invention.

FIG. 2 is a flow chart of an embodiment of the present invention.

FIG. 3 is a schematic diagram of a text line region detection result according to an embodiment of the present invention.

FIG. 4 shows a PDF file and a Word file before and after being processed by the visual information recovery system according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIG. 1 and FIG. 2, the present invention provides a visual information reproduction system after image type scanning piece recognition, which is loaded on a local CPU server and used concurrently by multiple users. The system is a visual information recovery system for the recognized content of image type scanning pieces; through a series of processing steps performed on the system, an image type scanning piece PDF is converted into an editable Word document.

The processing steps include the following:

First, a comparison table is established between character width pixels and line spacing pixels and the fonts, font sizes and line spacing sizes of the font library in a Word document;

that is, correspondences are established between all character width pixels and all line spacing pixels and the common fonts, font sizes and line spacing sizes in Word.
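A minimal sketch of how such a comparison table might be represented and queried. The font names, point sizes and pixel values are illustrative placeholders rather than values from the invention, and the nearest-match lookup rule is an assumption: the description only states that a correspondence is established.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontEntry:
    font: str             # font name in the Word font library
    size_pt: float        # font size in points
    char_width_px: float  # character width in pixels at the scan resolution


# Hypothetical table rows; real values would be measured for the target
# fonts at the resolution used when the PDF pages are cut into pictures.
FONT_TABLE = [
    FontEntry("SimSun", 10.5, 14.0),
    FontEntry("SimSun", 12.0, 16.0),
    FontEntry("SimSun", 14.0, 18.7),
    FontEntry("SimHei", 16.0, 21.3),
]

# Hypothetical rows: (line spacing in pixels, Word line spacing in points).
LINE_SPACING_TABLE = [(20.0, 15.0), (24.0, 18.0), (32.0, 24.0)]


def lookup_font(char_width_px: float) -> FontEntry:
    """Return the entry whose character width is closest to the measurement."""
    return min(FONT_TABLE, key=lambda e: abs(e.char_width_px - char_width_px))


def lookup_line_spacing(spacing_px: float) -> float:
    """Return the Word line spacing (points) closest to the measured spacing."""
    return min(LINE_SPACING_TABLE, key=lambda row: abs(row[0] - spacing_px))[1]
```

A nearest-match rule is used here because measured pixel widths rarely coincide exactly with a table row; an exact-match or banded lookup would be equally compatible with the description.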

Second, the image type scanning piece PDF is uploaded to the content visual information recovery system, and the system acquires the image type scanning piece PDF;

the system executes a localized encryption program while acquiring the image type scanning piece PDF.

Third, the image type scanning piece PDF is cut page by page into pictures and the pictures are preprocessed;

the preprocessing reduces interference factors that hinder the extraction of information from the image type scanning piece PDF, and includes stamp removal, tilt correction, noise removal, and the like.
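As one illustration of the preprocessing step, the sketch below removes red stamp pixels from a page represented as rows of (R, G, B) tuples. The page representation, the function name `remove_red_stamp`, the colour-threshold rule and the threshold values are all assumptions made for illustration; the description does not specify how stamp removal is carried out.

```python
WHITE = (255, 255, 255)


def remove_red_stamp(page, min_red=150, margin=60):
    """Replace strongly red pixels with white.

    page is a list of rows, each row a list of (r, g, b) tuples.
    A pixel counts as stamp ink when its red channel is at least min_red
    and exceeds both green and blue by at least margin.
    """
    cleaned = []
    for row in page:
        new_row = []
        for r, g, b in row:
            is_stamp = r >= min_red and r - g >= margin and r - b >= margin
            new_row.append(WHITE if is_stamp else (r, g, b))
        cleaned.append(new_row)
    return cleaned
```

Black text and white background pass through unchanged because their channels are roughly equal, so only saturated red ink is cleared.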

Then, the position information of the text line areas of all cut pictures of the image type scanning piece PDF is calculated through a text line detection algorithm based on deep learning and computer vision technology; that is, the character width $L_i$ and the start coordinate information and end coordinate information of all text lines, including the line width $W$ and the line height $H$, are calculated;

the image type scanning piece PDF is a long text, the long text image type scanning piece PDF is analyzed and processed page by page, the text line area detection and line area positioning are realized, and the initial coordinate information and the ending coordinate information of each text line area are analyzed and calculated;

for the position information of a single line, the start coordinate of the text line area is denoted $P_0[w,h]$ and the end coordinate of the text line area is denoted $P_1[w,h]$. $P_0$ and $P_1$ are at different positions: $P_0$ denotes the start position of each text line area and $P_1$ denotes its end position. The width of each line of text is the width value of $P_1$ minus the width value of $P_0$, and the height of each line of text is the height value of $P_0$ minus the height value of $P_1$; that is, the line width is $W = W_{P_1} - W_{P_0}$ and the line height is $H = H_{P_0} - H_{P_1}$.
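The two relations above can be sketched as a small data structure. The class name `LineBox` and its field names are hypothetical; only the subtraction order is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineBox:
    """Text line region given by its start point P0 and end point P1."""
    w0: float  # width coordinate of P0
    h0: float  # height coordinate of P0
    w1: float  # width coordinate of P1
    h1: float  # height coordinate of P1

    @property
    def width(self) -> float:
        # W = W_P1 - W_P0
        return self.w1 - self.w0

    @property
    def height(self) -> float:
        # H = H_P0 - H_P1
        return self.h0 - self.h1
```

For example, `LineBox(100, 240, 740, 220)` describes a line 640 pixels wide and 20 pixels high.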

The character content in all text line detection areas is then recognized through an OCR character recognition technology based on deep learning;

OCR (Optical Character Recognition) refers to a process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates the shapes into computer text by a character recognition method. In other words, for printed characters, the text in a paper document is converted optically into a black-and-white dot matrix image file, and recognition software converts the characters in the image into a text format for further editing and processing by word processing software.

Further, the number of characters, including punctuation marks, in each line detection area is counted and recorded as SUM;

the average character width pixels are calculated from the obtained line width $W$ and the number of characters SUM in the line, namely as $W / \mathrm{SUM}$.

The average character width pixels are then compared with the character width $L_i$ calculated above, and the smaller value is taken as the final character width pixel value; that is, the character width pixel value is finally determined as:

$$L = \min\left(\frac{W}{\mathrm{SUM}},\; L_i\right)$$
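The character count and the smaller-value rule can be sketched as follows. The function names are hypothetical, and two details are assumptions not stated in the text: whitespace is excluded from the count, and a line with no recognized characters falls back to the detected width.

```python
def count_characters(text: str) -> int:
    """Count characters in a recognized line, punctuation included."""
    # Assumption: whitespace does not count as a character.
    return sum(1 for ch in text if not ch.isspace())


def final_char_width(line_width_px: float, text: str,
                     detected_width_px: float) -> float:
    """Return the smaller of the average width W / SUM and the detected L_i."""
    total = count_characters(text)
    if total == 0:
        # No recognized characters: fall back to the detector's estimate.
        return detected_width_px
    average = line_width_px / total
    return min(average, detected_width_px)
```

With a line width of 128 pixels, eight recognized characters and a detected width of 17 pixels, the average is 16 pixels and is therefore the value kept.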

The obtained character width pixel value L is then looked up in the comparison table to obtain the corresponding font and font size, and the font and font size are matched to the position information of the line detection area;

the line spacing pixels are further calculated from the start coordinate information and end coordinate information of the text lines and looked up in the comparison table to obtain the corresponding line spacing size, and meanwhile the paragraph visual information is calculated to determine whether a line is a paragraph head;

specifically, the start coordinate information and end coordinate information of adjacent line areas are first determined, the adjacent lines comprising a high-order line and a low-order line; the line spacing pixels are then determined by calculating the line height difference between the start coordinate information of the high-order line and that of the low-order line, and the paragraph visual information is determined by calculating the line width difference between the start coordinate information of the high-order line and that of the low-order line.
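A minimal sketch of the two differences described in this paragraph, computed from the start coordinates of a high-order line and a low-order line. The function names are hypothetical, and absolute values are used because the orientation of the coordinate axes is not stated; that choice is an assumption.

```python
def line_spacing_px(upper_start, lower_start):
    """Height difference between the start coordinates of adjacent lines.

    Each argument is a (w, h) start coordinate. The absolute value is taken
    because the direction of the height axis is an assumption here.
    """
    return abs(lower_start[1] - upper_start[1])


def start_offset_px(upper_start, lower_start):
    """Width difference between the start coordinates of adjacent lines."""
    return abs(upper_start[0] - lower_start[0])
```

For a high-order line starting at (132, 200) and a low-order line starting at (100, 232), both differences are 32 pixels: the first would be looked up as a line spacing, the second compared against the character width.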

For example, as shown in FIG. 3, the start coordinate information and end coordinate information of adjacent lines of the current text are obtained, namely the start coordinate and end coordinate of the high-order line and the start coordinate and end coordinate of the low-order line. As can be seen from FIG. 3, the start coordinate and the end coordinate of a line are at different positions: the start coordinate denotes the starting position of the line and the end coordinate denotes its ending position. The width pixels of each line of text are the width pixel value of the end coordinate minus the width pixel value of the start coordinate, and the height pixels of each line of text are the height pixel value of the start coordinate minus the height pixel value of the end coordinate;

the line spacing pixels of adjacent lines are calculated from these coordinates, and the calculation result is looked up in the comparison table to determine the text line spacing;

the width difference between the start coordinates of the adjacent lines is calculated to obtain the paragraph visual information;

the calculated width difference between adjacent lines is positive and may be an integer or a non-integer; if the width difference is approximately equal to 2 times the average character width pixels, the line is marked as a paragraph head, that is, the line begins with two blank character spaces.
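The paragraph head test can be sketched as below. The text only says "approximately 2 times", so the relative `tolerance` parameter and its default value are assumptions, as are the function names. The second function measures indentation against the smallest start coordinate on the page, which is a simplification of the adjacent-line comparison described above.

```python
def is_paragraph_head(indent_px: float, char_width_px: float,
                      tolerance: float = 0.25) -> bool:
    """Return True when indent_px is approximately two character widths.

    tolerance is the allowed relative deviation from 2 * char_width_px;
    its default is an assumption, not a value from the description.
    """
    if char_width_px <= 0:
        return False
    expected = 2 * char_width_px
    return abs(indent_px - expected) <= tolerance * expected


def mark_paragraph_heads(line_starts, char_width_px, tolerance=0.25):
    """Flag each line whose start is indented by about two characters.

    line_starts holds the width coordinate of each line's start point,
    ordered from the top of the page to the bottom.
    """
    if not line_starts:
        return []
    margin = min(line_starts)  # assumed left margin of the text block
    return [is_paragraph_head(w - margin, char_width_px, tolerance)
            for w in line_starts]
```

Comparing against a page-level margin rather than only the neighbouring line keeps the test usable when two single-line paragraphs follow each other, in which case adjacent start coordinates would show no difference.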

As shown in FIG. 4, the image type scanning piece PDF is processed by the text line detection algorithm based on deep learning and computer vision technology, the OCR character recognition technology based on deep learning, and the series of calculations described above, and an editable Word document is finally output according to the font, font size, line spacing and paragraph visual information. The text on the left side of FIG. 4 is the source file of the image type scanning piece PDF, and the text on the right side of FIG. 4 is the content of the Word document after the visual information is restored.

In the invention, the comparison table of fonts and font sizes is searched using the character width pixels of steps S4 to S8 to obtain the corresponding font and font size, the paragraph head is calculated in step S9, and a paragraph head is identified by a blank space of 2 average character width pixels.

The invention provides a visual information reproduction system after image type scanning piece recognition. The system is loaded on a local CPU server and used concurrently by multiple users, is a visual information recovery system for the recognized content of image type scanning pieces, and comprises:

a memory for storing executable instructions;

and a processor for implementing the above visual information reproduction method after image type scanning piece recognition when running the executable instructions stored in the memory.

The invention also provides a computer readable storage medium, which stores executable instructions, and the executable instructions are executed by a processor to realize the visual information reproduction method after the image type scanning element is identified.

The invention loads the visual information recovery system for the recognized content of image type scanning pieces on a local CPU server and establishes a comparison table between character width pixels and line spacing pixels and the fonts and font sizes of the font library in a Word document. The user uploads the image type scanning piece PDF to be reproduced, and a security program is executed during the upload. The image type scanning piece PDF is then preprocessed, and the system uses a visual information analysis algorithm based on computer vision and an OCR character recognition technology based on deep learning to calculate the character width $L_i$, the line width $W$ and the line height $H$ and to recognize the character content in all text line areas. The number of characters SUM in each line detection area is then counted, the average character width pixels $W / \mathrm{SUM}$ are calculated, and the final character width pixel value L is determined and looked up in the comparison table to obtain the corresponding font and font size. The line spacing pixels are calculated from the start coordinates and end coordinates of adjacent lines of the current text and looked up in the comparison table to determine the line spacing. The paragraph visual information is then calculated: the width difference between the start coordinates of adjacent lines is computed, and if the result is approximately 2 times the average character width pixels, the line is marked as a paragraph head, that is, the line begins with two blank character spaces. Finally, the editable Word document is output according to the font, font size, line spacing and paragraph visual information, thereby achieving the object of the invention.

The visual information recovery system for the recognized content of image type scanning pieces supports concurrent uploads by multiple users. Uploads do not affect one another, and the uploaded files are kept confidential, so user data are not leaked.
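One way to keep concurrent uploads from affecting one another is to give each upload its own temporary working directory. The sketch below illustrates that idea only: the function names are hypothetical, the conversion pipeline is replaced by a placeholder, and the encryption program mentioned in the description is not implemented here.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def handle_upload(user_id: str, pdf_bytes: bytes) -> dict:
    """Process one upload inside a private temporary directory."""
    with tempfile.TemporaryDirectory(prefix=f"upload_{user_id}_") as workdir:
        pdf_path = Path(workdir) / "input.pdf"
        pdf_path.write_bytes(pdf_bytes)
        # Placeholder for the conversion pipeline; only the size is reported.
        size = pdf_path.stat().st_size
    # The directory and its contents are deleted when the block exits.
    return {"user": user_id, "bytes": size}


def handle_uploads(uploads: dict, max_workers: int = 4) -> list:
    """Run uploads concurrently; uploads maps a user id to PDF bytes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(handle_upload, user, data)
                   for user, data in uploads.items()]
        return [future.result() for future in futures]
```

Because every upload works in its own directory that is removed afterwards, one user's intermediate files are never visible to another user's job and do not persist on the server.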

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A method for reproducing visual information recognized by an image type scanning piece, characterized in that the method comprises the following steps:

S2: uploading an image type scanning piece PDF;

2. The method for reproducing visual information recognized by an image type scanning piece according to claim 1, wherein step S1 includes: establishing correspondences between all character width pixels and all line spacing pixels and the common fonts, font sizes and line spacing sizes in Word.

3. The method for reproducing visual information recognized by an image type scanning piece according to claim 1, wherein step S2 includes: executing a localized encryption program when the image type scanning piece PDF is uploaded.

4. The method for reproducing visual information recognized by an image type scanning piece according to claim 1, wherein the preprocessing in step S3 includes: stamp removal, tilt correction, and noise removal.

5. The method for reproducing visual information recognized by an image type scanning piece according to claim 1, wherein step S4 includes: the image type scanning piece PDF is a long text; the pictures of the long-text image type scanning piece are analyzed and processed page by page to detect and locate the text line areas, and the start coordinate information and end coordinate information of each line area are analyzed and calculated.

6. The method for reproducing visual information recognized by an image type scanning piece according to claim 1, wherein step S9 includes: according to the start coordinate information and end coordinate information of adjacent line areas, determining the line spacing pixels by calculating the line height difference between the end coordinate of the high-order line and the start coordinate of the low-order line, and determining the paragraph visual information by calculating the line width difference between the start coordinate of the high-order line and the start coordinate of the low-order line.

7. The method for reproducing visual information recognized by an image type scanning piece according to claim 6, wherein step S9 determines the paragraph head as follows: if the line width difference between the start coordinates of adjacent lines is approximately 2 times the character width pixels calculated in step S7, the line is marked as a paragraph head, that is, the line begins with two blank character spaces.

8. The method for reproducing visual information recognized by an image type scanning piece according to any one of claims 1 to 7, wherein the comparison table of fonts, font sizes and line spacings is searched using the character width pixels of steps S4 to S8 to obtain the corresponding font size, the paragraph head is calculated in step S9, and a paragraph head is identified by its blank space of 2 characters.

9. A system for reproducing visual information after image type scanning piece recognition, the system being loaded on a local CPU server and used concurrently by multiple users, characterized in that the system is a visual information recovery system for the recognized content of image type scanning pieces and comprises: a memory for storing executable instructions; and a processor for executing the executable instructions stored in the memory to implement the method for reproducing visual information recognized by an image type scanning piece according to any one of claims 1 to 8.

10. A computer-readable storage medium storing executable instructions, wherein the executable instructions, when executed by a processor, implement the method for reproducing visual information recognized by an image type scanning piece according to any one of claims 1 to 8.