CN111401312B

CN111401312B - PDF drawing text recognition method, system and equipment

Info

Publication number: CN111401312B
Application number: CN202010278085.5A
Authority: CN
Inventors: 张东锋; 曾雏鹏; 李俊波
Original assignee: Shenzhen Xinzhi Software Co ltd
Current assignee: Shenzhen Xinzhi Software Co ltd
Priority date: 2020-04-10
Filing date: 2020-04-10
Publication date: 2024-04-26
Anticipated expiration: 2040-04-10
Also published as: CN111401312A

Abstract

The invention provides a PDF drawing text recognition method, system and device. The method comprises: an optical character recognition step based on deep learning; a customized recognition and general recognition step; and a mobile device low-quality image recognition step. The optical character recognition step based on deep learning comprises detecting regions containing text in a scene and recognizing the characters in those regions, wherein text detection is performed based on the CTPN, SegLink, TextBoxes, FTSN, PixelLink and CRAFT algorithms, and character recognition is performed based on CNN and CRNN algorithms. The customized recognition step comprises: identifying the type of the PDF drawing according to the table text or the frame content in the PDF; extracting the content in a region according to its structural features; and extracting a key region, then recognizing the characters in that region or extracting key characters through a deep neural network.

Description

PDF drawing text recognition method, system and equipment

Technical Field

The present invention relates to the field of image processing, and in particular, to a PDF drawing text recognition method, system, and device.

Background

Artificial intelligence has developed rapidly in data, algorithms and computing power, and has entered a new wave of growth against the backdrop of the digital transformation of the global economy. The most notable feature of this wave is that its influence is spreading from specialist fields to the general public.

High-precision PDF recognition is now a mature technology on the market, and methods based on conventional OCR and on deep learning are applied across many industries. Bank note recognition, PDF form recognition and industrial drawing recognition are all widely used, well-established techniques. Recognition of formatted and templated PDFs achieves good precision and speed, improving the working efficiency and capacity of practitioners in these industries.

Traditional PDF recognition handles PDFs in an essentially fixed form and places certain requirements on PDF quality. With the spread of smartphones, traditional methods offer no good solution for low-quality PDF images photographed with personal mobile phones.

Moreover, most current PDF recognition processes the whole document and offers no extraction and recognition of a specific area. For structured PDF drawings, extracting a user's region of interest (POI) and parsing its content is also a feature of the present solution.

Disclosure of Invention

An object of the invention is to provide a PDF drawing text recognition method, system and device that offer multiple recognition schemes, including customized and general-scene recognition, and that, based on recent deep-learning OCR algorithms and corresponding image processing techniques, give users a solution for recognizing low-quality images.

A further object of the invention is to provide a PDF drawing text recognition method, system and device that can recognize a variety of scenes, such as industrial drawings, notes and images captured with personal devices, and thereby meet the different requirements of different users.

In order to achieve at least one of the above objects, the present invention provides a PDF drawing text recognition method, which includes the following steps:

an optical character recognition step based on deep learning; a customized recognition and general recognition step; and a mobile device low-quality image recognition step;

wherein the optical character recognition step based on deep learning comprises: detecting regions containing text in a scene and recognizing the characters in those regions, wherein text detection is performed based on the CTPN, SegLink, TextBoxes, FTSN, PixelLink and CRAFT algorithms, and character recognition is performed based on CNN and CRNN algorithms;

wherein the customized recognition step comprises: identifying the type of the PDF drawing according to the table text or the frame content in the PDF; extracting the content in a region according to its structural features; and extracting a key region, then recognizing the characters in that region or extracting key characters through a deep neural network.

In some embodiments, the step of extracting the key region in the customized recognition step further comprises the following steps:

extracting a key region according to the proportion of the POI;

extracting all frames by Hough transform and corner detection;

extracting a key region according to fuzzy matching and exact matching of the text;

extracting a key region according to the edge characteristics of the region; and

extracting a key region according to the characteristics of the text in the region.
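As a non-limiting illustration of extraction according to the proportion of the POI, the following Python sketch converts a POI expressed as fractions of the sheet into a pixel box. The function name poi_box, the fractional (left, top, right, bottom) representation, and the example sheet size and title-block position are assumptions made for illustration and are not taken from the disclosure.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels


def poi_box(width: int, height: int,
            poi: Tuple[float, float, float, float]) -> Box:
    """Convert a fractional (left, top, right, bottom) POI into a pixel box."""
    left, top, right, bottom = poi
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError("POI fractions must satisfy 0 <= start < end <= 1")
    return (round(left * width), round(top * height),
            round(right * width), round(bottom * height))


# Hypothetical example: a title block assumed to occupy the bottom-right
# corner (last quarter of the width, last fifth of the height) of the sheet.
title_block = poi_box(4000, 3000, (0.75, 0.8, 1.0, 1.0))
```

Expressing the POI as a proportion rather than in absolute pixels lets the same definition be reused for drawings rendered at different resolutions.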

In some embodiments, in the step of extracting the key region, the key region is extracted according to the edge characteristics of the region, namely its shape, symmetry, angle or edge granularity, and according to the characteristics of the text in the region, namely its font, size or type.

In some embodiments, the mobile device low-quality image recognition step further comprises the following steps:

performing filtering processing on the image;

performing image enhancement processing on the image;

performing image edge sharpening processing on the image;

performing image texture analysis processing on the image;

performing image segmentation processing on the image;

performing geometric analysis processing on the image;

performing image matching processing on the image; and

performing morphological processing on the image.
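The filtering step listed above can be illustrated, purely as a sketch, with a 3x3 median filter, a common choice for suppressing isolated noise pixels in photographed documents. The disclosure does not name a specific filter; the median filter, the function name median_filter3 and the clamped border handling are assumptions.

```python
from statistics import median
from typing import List

Gray = List[List[int]]   # grayscale image as rows of pixel values


def median_filter3(image: Gray) -> Gray:
    """Replace each pixel by the median of its 3x3 neighbourhood.

    Coordinates that fall outside the image are clamped to the nearest edge.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


# A single bright noise pixel on a uniform background is removed.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
smoothed = median_filter3(noisy)
```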

In some embodiments, the filtering processing smooths the image and reduces its noise; the image texture analysis processing performs skeletonization and connectivity processing on the image; the image matching processing performs template matching and search matching on the image; and the morphological processing performs dilation, erosion, and opening and closing operations on the image.
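The morphological operations named above can be sketched on a binary image as follows. This is a minimal illustration only: the 3x3 square structuring element, the treatment of pixels outside the image as background, and all function names are assumptions that the disclosure does not specify.

```python
from typing import List

Binary = List[List[int]]   # 1 = foreground (ink), 0 = background


def _window(image: Binary, y: int, x: int) -> List[int]:
    """Values in the 3x3 window around (y, x); outside pixels count as 0."""
    h, w = len(image), len(image[0])
    return [image[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def dilate(image: Binary) -> Binary:
    """A pixel becomes foreground if any pixel in its window is foreground."""
    return [[int(any(_window(image, y, x))) for x in range(len(image[0]))]
            for y in range(len(image))]


def erode(image: Binary) -> Binary:
    """A pixel stays foreground only if its whole window is foreground."""
    return [[int(all(_window(image, y, x))) for x in range(len(image[0]))]
            for y in range(len(image))]


def opening(image: Binary) -> Binary:
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(image))


def closing(image: Binary) -> Binary:
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(image))


# An isolated speck disappears under opening.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1

# A 3x3 block with a one-pixel hole is made solid by closing.
holed = [[0] * 7 for _ in range(7)]
for yy in range(2, 5):
    for xx in range(2, 5):
        holed[yy][xx] = 1
holed[3][3] = 0
```

In this sketch, opening would typically be applied to remove noise specks before recognition, and closing to repair broken strokes or frame lines.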

According to another aspect of the present invention, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the PDF drawing text recognition method.

According to another aspect of the present invention, there is also provided a PDF drawing text recognition apparatus, including: a software application, a memory for storing the software application, and a processor for executing the software application; wherein the programs of the software application execute the corresponding steps of the PDF drawing text recognition method.

According to another aspect of the present invention, there is also provided a PDF drawing text recognition system, including an optical character recognition unit, a customized recognition and general recognition unit, and a mobile device low-quality image recognition unit. The optical character recognition unit includes a text detection module and a text recognition module, wherein the text detection module is configured to detect regions containing text in a scene and to perform text detection based on the CTPN, SegLink, TextBoxes, FTSN, PixelLink and CRAFT algorithms, and the text recognition module is configured to recognize the characters in the detected regions based on CRNN and CNN algorithms. The customized recognition and general recognition unit is provided with a customized recognition module configured to extract the content in a region according to its structural features and to recognize the characters in the region or extract key characters through a deep neural network. The mobile device low-quality image recognition unit is configured to perform filtering, enhancement, edge sharpening, texture analysis, segmentation, geometric analysis, matching and morphological processing on the image to be recognized.

In some embodiments, the customized recognition module is provided with a key region extraction module, which further includes a POI proportion extraction module, a Hough transform corner detection extraction module, a text fuzzy and exact matching extraction module, a region edge characteristic extraction module, and a region text characteristic extraction module. The POI proportion extraction module is configured to extract a key region according to the proportion of the POI; the Hough transform corner detection extraction module is configured to extract all frames by Hough transform and corner detection; the text fuzzy and exact matching extraction module is configured to extract a key region according to fuzzy matching and exact matching of the text; the region edge characteristic extraction module is configured to extract a key region according to the edge characteristics of the region; and the region text characteristic extraction module is configured to extract a key region according to the characteristics of the text in the region.

In some embodiments, the edge characteristics used by the region edge characteristic extraction module are shape, symmetry, angle and edge granularity, and the text characteristics used by the region text characteristic extraction module are font, size and text type.

Drawings

Fig. 1 is a flowchart of steps of a PDF drawing text recognition method according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of a PDF drawing text recognition system according to an embodiment of the present invention.

Detailed Description

The following description is presented to enable one of ordinary skill in the art to make and use the invention. The preferred embodiments in the following description are by way of example only and other obvious variations will occur to those skilled in the art. The basic principles of the invention defined in the following description may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.

It will be understood that the terms "a" and "an" should be interpreted as referring to "at least one" or "one or more," i.e., in one embodiment, the number of elements may be one, while in another embodiment, the number of elements may be plural, and the term "a" should not be interpreted as limiting the number.

The present invention relates to computer programs. Fig. 1 is a flowchart of the PDF drawing text recognition method according to the present invention, which sets out a solution in which a computer program written according to this flow is executed to control or process objects external or internal to the computer. The method uses a computer system and combines manual experience with machine learning results; it can recognize a variety of scenes, such as industrial drawings, notes and images captured with personal devices, and can meet the different requirements of different users. It will be understood that the computer is not limited to a desktop computer, notebook computer or tablet, and also includes any other intelligent electronic device capable of running programs and processing data.

As shown in fig. 1, the PDF drawing text recognition method includes the following steps:

S10: an optical character recognition step based on deep learning.

The optical character recognition step includes the steps of:

detecting regions containing text in a scene, wherein text detection is performed based on the CTPN, SegLink, TextBoxes, FTSN, PixelLink and CRAFT algorithms; and

recognizing the characters in the detected regions, wherein the characters are recognized based on CRNN and CNN algorithms.
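The two-stage flow of step S10, detection followed by recognition, can be sketched in Python as below. This is a structural illustration under stated assumptions, not the implementation of the disclosure: the names TextRegion, run_ocr, fake_detector and fake_recognizer are hypothetical, and the stand-in detector and recognizer merely take the place of trained models such as CRAFT or CRNN.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Image = List[List[int]]           # grayscale image as rows of pixel values


@dataclass
class TextRegion:
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_ocr(image: Image,
            detect: Callable[[Image], List[Box]],
            recognize: Callable[[Image], str]) -> List[TextRegion]:
    """Stage 1 finds text boxes; stage 2 reads the characters in each box."""
    return [TextRegion(box, recognize(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins so the sketch runs without a trained model.
def fake_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 1), (1, 1, 3, 2)]


def fake_recognizer(patch: Image) -> str:
    return "x" * sum(len(row) for row in patch)


page = [[0, 0, 0],
        [0, 0, 0]]
regions = run_ocr(page, fake_detector, fake_recognizer)
```

Passing the detector and recognizer as parameters keeps the two stages independent, so either one can be exchanged for a different algorithm without changing the surrounding flow.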

Further, the PDF drawing text recognition method includes step S20: a customized recognition and general recognition step.

The customized recognition and general recognition step comprises:

a customized recognition step; and

a general recognition step.

The customized recognition step further comprises the following steps:

identifying the type of the PDF drawing according to the table text or the frame content in the PDF;

extracting the content in a region according to its structural features; and

extracting a key region, then recognizing the characters in that region or extracting key characters through a deep neural network.
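As one possible reading of identifying the drawing type from table text, the sketch below scores recognized table text against keyword lists. The disclosure does not state how the type is determined; the keyword-counting rule, the function name classify_drawing, and the drawing types and keywords in TYPE_KEYWORDS are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical keyword table; the drawing types and keywords are illustrative only.
TYPE_KEYWORDS: Dict[str, List[str]] = {
    "mechanical": ["tolerance", "material", "surface finish"],
    "electrical": ["voltage", "circuit", "wiring"],
}


def classify_drawing(table_text: List[str]) -> Optional[str]:
    """Pick the drawing type whose keywords occur most often in the table text.

    Returns None when no keyword of any type is found.
    """
    joined = " ".join(table_text).lower()
    scores = {kind: sum(joined.count(word) for word in words)
              for kind, words in TYPE_KEYWORDS.items()}
    best = max(scores, key=lambda kind: scores[kind])
    return best if scores[best] > 0 else None


kind = classify_drawing(["Material: steel", "Tolerance 0.1"])
```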

The step of extracting the key region further comprises:

extracting according to the proportion of the POI;

extracting all frames by Hough transform and corner detection;

extracting according to fuzzy matching and exact matching of the text;

extracting according to the edge characteristics of the region, such as shape, symmetry, angle and edge granularity; and

extracting according to the characteristics of the text in the region, such as font, size and text type.
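The combination of exact and fuzzy text matching can be sketched with the Python standard library as follows. This is an assumption-laden illustration: the function name match_label, the 0.8 similarity threshold, the use of difflib.SequenceMatcher as the similarity measure, and the example labels are not taken from the disclosure.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

# Hypothetical labels that might mark key regions in a title block.
LABELS = ["Drawing No.", "Scale", "Material"]


def match_label(recognized: str, labels: List[str],
                threshold: float = 0.8) -> Optional[Tuple[str, str]]:
    """Return (label, "exact") or (label, "fuzzy") for the best match, else None."""
    text = recognized.strip().lower()
    # Exact matching is tried first because it is cheap and unambiguous.
    for label in labels:
        if text == label.lower():
            return label, "exact"
    # Fuzzy matching tolerates OCR slips such as "1" read in place of "l".
    best_label, best_score = None, 0.0
    for label in labels:
        score = SequenceMatcher(None, text, label.lower()).ratio()
        if score > best_score:
            best_label, best_score = label, score
    if best_label is not None and best_score >= threshold:
        return best_label, "fuzzy"
    return None


exact_hit = match_label("Scale", LABELS)
fuzzy_hit = match_label("Materia1", LABELS)
```

A region whose recognized text matches a known label, exactly or approximately, can then be treated as a candidate key region.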

Further, the PDF drawing text recognition method includes step S30: a mobile device low-quality image recognition step.

The mobile device low-quality image recognition step further comprises the following steps:

filtering, such as image smoothing and image denoising;

image enhancement;

image edge sharpening;

image texture analysis, such as skeletonization and connectivity analysis;

image segmentation;

geometric analysis;

image matching, such as template matching and search matching; and

morphological processing, such as dilation, erosion, and opening and closing operations.
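Template matching, one of the image matching operations listed above, can be sketched as an exhaustive search that minimizes the sum of absolute differences. The cost measure, the function name match_template and the tiny example image are assumptions chosen for illustration; the disclosure does not fix a particular matching criterion.

```python
from typing import List, Optional, Tuple

Gray = List[List[int]]   # grayscale image as rows of pixel values


def match_template(image: Gray, template: Gray) -> Tuple[int, int, int]:
    """Slide the template over the image; return (row, col, cost) of the best fit.

    Cost is the sum of absolute differences, so 0 means a pixel-perfect match.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best: Optional[Tuple[int, int, int]] = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            cost = sum(abs(image[y + dy][x + dx] - template[dy][dx])
                       for dy in range(th) for dx in range(tw))
            if best is None or cost < best[2]:
                best = (y, x, cost)
    if best is None:
        raise ValueError("template is larger than the image")
    return best


image = [[0, 0, 0, 0],
         [0, 9, 8, 0],
         [0, 7, 9, 0]]
template = [[9, 8],
            [7, 9]]
location = match_template(image, template)
```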

When an enterprise provides a recognition portal, individual users are able to photograph PDF drawings and submit them for recognition. However, images captured by users are often of poor quality because of lighting conditions, shooting angle and similar factors. The PDF drawing text recognition method improves image quality and thereby recognition precision. For low-quality images supplied from a user's personal device, the mobile device low-quality image recognition step gives the image better expressiveness and brings its quality close to that of a high-precision PDF, so that the recognition algorithm performs better. This also extends the applicability and generalization of the overall algorithm.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided in the form of a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.

It will be appreciated by those skilled in the art that the PDF drawing text recognition method of the present invention may be implemented by hardware, software, or a combination of hardware and software. The invention may be implemented in a centralized fashion in at least one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods is suited. The combination of hardware and software may be a general-purpose computer system with a computer program installed thereon, and the computer system may be controlled to operate according to the method by installing and executing the program.

The present invention can be embedded in a computer program product that comprises all the features enabling implementation of the methods described herein. The computer program product is embodied in one or more computer-readable storage media having computer-readable program code embodied therein. According to another aspect of the invention, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the invention. A computer storage medium is a medium in computer memory that stores some discrete physical quantity. Computer storage media include, but are not limited to, semiconductors, disk storage, magnetic cores, drums, tapes, laser disks and the like. It will be appreciated by those skilled in the art that computer storage media are not limited to the foregoing examples, which are provided by way of example only and do not limit the invention.

A typical combination of hardware and software could be a general-purpose computer system with a computer program that, when loaded and executed, controls the computer system such that it carries out the methods disclosed herein. According to another aspect of the present invention, there is also provided a PDF drawing text recognition apparatus, including: a software application, a memory for storing the software application, and a processor for executing the software application. The programs of the software application execute the corresponding steps of the PDF drawing text recognition method.

Corresponding to the method embodiment, according to another aspect of the invention there is also provided a PDF drawing text recognition system, which applies the PDF drawing text recognition method to the improvement of computer programs.

As shown in fig. 2, in this embodiment of the present invention, the PDF drawing text recognition system includes an optical character recognition unit 100, a customized recognition and general recognition unit 200, and a mobile device low-quality image recognition unit 300.

Specifically, the optical character recognition unit 100 includes a text detection module 110 and a text recognition module 120. The text detection module 110 detects regions containing text in the scene, preferably performing text detection based on the CTPN, SegLink, TextBoxes, FTSN, PixelLink and CRAFT algorithms. The text recognition module 120 recognizes the text in the detected regions, preferably based on CRNN and CNN algorithms. Those skilled in the art will appreciate that in other embodiments of the present invention, algorithms other than CRNN and CNN may be used, and the present invention is not limited in this respect.

With the development of neural networks in computer vision, the precision of optical character recognition (OCR) has improved greatly over conventional techniques. Against the background of deep learning, character recognition has expanded from traditional scenes to general scenes, that is, text recognition in natural scenes. The algorithm models of the optical character recognition unit are chosen to secure the highest possible recognition accuracy for users. In addition, different users may have different requirements for speed and precision, so the PDF drawing text recognition system adapts its algorithms to the scene and the user: for simple scenes, the text recognition module 120 replaces more complex text recognition algorithms (such as CRNN) with a CNN so as to offer a faster recognition scheme.
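The trade-off between a faster CNN and a more capable CRNN can be sketched as a simple selection rule. The disclosure does not say how a scene is judged to be simple; the Scene fields, the single-character criterion and the function name choose_recognizer below are assumptions made only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    fixed_layout: bool       # assumed flag: templated drawing with known fields
    chars_per_region: int    # assumed estimate of characters in each text region


def choose_recognizer(scene: Scene, prefer_speed: bool = True) -> str:
    """Return "cnn" for simple scenes when speed is preferred, otherwise "crnn".

    Assumption: a scene counts as simple when the layout is fixed and each
    region holds a single character, which a plain CNN classifier can label
    without sequence modelling.
    """
    simple = scene.fixed_layout and scene.chars_per_region <= 1
    return "cnn" if simple and prefer_speed else "crnn"


simple_choice = choose_recognizer(Scene(fixed_layout=True, chars_per_region=1))
complex_choice = choose_recognizer(Scene(fixed_layout=False, chars_per_region=12))
```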

The customized recognition and general recognition unit 200 is provided with a customized recognition module 210, which is configured to extract the content in a region according to its structural features and to recognize the characters in the region or extract key characters through a deep neural network. Preferably, the customized recognition module 210 is provided with a key region extraction module 220, which further includes a POI proportion extraction module 221, a Hough transform corner detection extraction module 222, a text fuzzy and exact matching extraction module 223, a region edge characteristic extraction module 224, and a region text characteristic extraction module 225. The POI proportion extraction module 221 extracts the key region according to the proportion of the POI; the Hough transform corner detection extraction module 222 extracts all frames using the Hough transform and corner detection; the text fuzzy and exact matching extraction module 223 extracts the content in the key region according to fuzzy matching and exact matching of the text; the region edge characteristic extraction module 224 extracts the content in the key region according to the edge characteristics of the region, such as shape, symmetry, angle and edge granularity; and the region text characteristic extraction module 225 extracts the content in the key region according to the characteristics of the text in the region, such as font, size and text type.
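The line-finding part of frame extraction by Hough transform can be sketched as a voting procedure over (rho, theta) space. This is a minimal illustration, assuming an already binarized edge image and, by default, only axis-aligned frame lines (angles of 0 and 90 degrees); the function name hough_lines and the vote threshold are hypothetical, and corner detection is not shown.

```python
import math
from collections import Counter
from typing import List, Tuple


def hough_lines(edges: List[List[int]], min_votes: int,
                angles_deg: Tuple[int, ...] = (0, 90)) -> List[Tuple[int, int, int]]:
    """Vote in (rho, theta) space; return sorted (rho, theta_deg, votes) triples.

    An edge pixel (x, y) lies on the line rho = x*cos(theta) + y*sin(theta),
    so theta = 0 gives vertical lines x = rho and theta = 90 gives
    horizontal lines y = rho.
    """
    votes: Counter = Counter()
    for y, row in enumerate(edges):
        for x, value in enumerate(row):
            if value:
                for angle in angles_deg:
                    theta = math.radians(angle)
                    rho = round(x * math.cos(theta) + y * math.sin(theta))
                    votes[(rho, angle)] += 1
    return sorted((rho, angle, n) for (rho, angle), n in votes.items()
                  if n >= min_votes)


# A 5x5 edge image whose border pixels form a rectangular frame.
frame = [[1, 1, 1, 1, 1],
         [1, 0, 0, 0, 1],
         [1, 0, 0, 0, 1],
         [1, 0, 0, 0, 1],
         [1, 1, 1, 1, 1]]
lines = hough_lines(frame, min_votes=5)
```

In this sketch, the intersections of the detected vertical and horizontal lines give candidate frame corners, which a corner detector could then confirm.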

The mobile device low-quality image recognition unit 300 is configured to perform filtering processing, image enhancement processing, image edge sharpening processing, image texture analysis processing, image segmentation processing, geometric analysis processing, image matching processing and morphological processing on the image to be recognized.
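How such processing operations might be chained can be sketched as an ordered list of steps applied one after another. The sketch below shows only two of them, under stated assumptions: linear contrast stretching stands in for image enhancement and fixed-threshold binarization stands in for image segmentation. Neither choice, nor the names run_steps, stretch_contrast and binarize, comes from the disclosure.

```python
from functools import reduce
from typing import Callable, List, Sequence

Gray = List[List[int]]          # grayscale image as rows of pixel values
Step = Callable[[Gray], Gray]


def stretch_contrast(image: Gray) -> Gray:
    """Enhancement: linearly rescale pixel values to the full 0..255 range."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [row[:] for row in image]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in image]


def binarize(image: Gray, threshold: int = 128) -> Gray:
    """Segmentation: mark pixels at or above the threshold as 1, others as 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def run_steps(image: Gray, steps: Sequence[Step]) -> Gray:
    """Apply each processing step in order, feeding each result to the next."""
    return reduce(lambda current, step: step(current), steps, image)


# A faint, low-contrast image: thresholding alone loses one of the two
# brighter pixels, while enhancement followed by thresholding keeps both.
faint = [[100, 110],
         [120, 130]]
direct = binarize(faint)
enhanced = run_steps(faint, [stretch_contrast, binarize])
```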

When an enterprise provides a recognition portal, individual users are able to photograph PDF drawings and submit them for recognition. However, images captured by users are often of poor quality because of lighting conditions, shooting angle and similar factors. The mobile device low-quality image recognition unit 300 applies a variety of image processing schemes to improve image quality and thereby recognition accuracy.

For low-quality images supplied from a user's personal device, processing by the PDF drawing text recognition system gives the image better expressiveness and brings its quality close to that of a high-precision PDF, so that the recognition algorithm performs better and the application scenarios and generalization of the overall algorithm are extended.

It will be appreciated by persons skilled in the art that the present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, systems and computer program products according to the invention. Each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart and/or block diagram block or blocks.

It will be appreciated by persons skilled in the art that the embodiments of the invention described above and shown in the drawings are by way of example only and are not limiting. The objects of the present invention have been fully and effectively achieved. The functional and structural principles of the present invention have been shown and described in the examples, and the embodiments of the invention may be modified or practiced in other ways without departing from such principles.

Claims

1. A PDF drawing text recognition method, characterized by comprising the following steps:

wherein the customized recognition step comprises: identifying the type of the PDF drawing according to the table text or the frame content in the PDF; extracting the content in a region according to its structural features; and extracting a key region, then recognizing the characters in that region or extracting key characters through a deep neural network;

wherein the step of extracting the key region in the customized recognition step further comprises: extracting a key region according to the proportion of the POI; extracting all frames by Hough transform and corner detection; extracting a key region according to fuzzy matching and exact matching of the text; extracting a key region according to the edge characteristics of the region; and

extracting a key region according to the characteristics of the text in the region;

wherein, in the step of extracting the key region in the customized recognition step, the key region is extracted according to the edge characteristics of the region, namely its shape, symmetry, angle or edge granularity, and according to the characteristics of the text in the region, namely its font, size or text type;

wherein the mobile device low-quality image recognition step further comprises: performing filtering processing on the image; performing image enhancement processing on the image; performing image edge sharpening processing on the image; performing image texture analysis processing on the image; performing image segmentation processing on the image; performing geometric analysis processing on the image; performing image matching processing on the image; and performing morphological processing on the image;

wherein the filtering processing performs image smoothing and image noise reduction on the image; the image texture analysis processing performs skeletonization and connectivity processing on the image; the image matching processing performs template matching and search matching on the image; and the morphological processing performs dilation, erosion, and opening and closing operations on the image.

2. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor performs the steps of the PDF drawing text recognition method of claim 1.

3. A PDF drawing text recognition device, characterized in that the PDF drawing text recognition device comprises: a software application, a memory for storing the software application, and a processor for executing the software application; wherein each program of the software application program correspondingly executes the steps in the PDF drawing text recognition method as claimed in claim 1.

4. A PDF drawing text recognition system, comprising an optical character recognition unit, a customized recognition and general recognition unit, and a mobile device low-quality image recognition unit; wherein the optical character recognition unit comprises a text detection module and a text recognition module; wherein the text detection module is configured to detect regions containing text in a scene and to perform text detection based on the CTPN, SegLink, TextBoxes, FTSN, PixelLink and CRAFT algorithms; wherein the text recognition module is configured to recognize the characters in the detected regions based on CRNN and CNN algorithms; wherein the customized recognition and general recognition unit is provided with a customized recognition module configured to extract the content in a region according to its structural features and to recognize the characters in the region or extract key characters through a deep neural network; wherein the mobile device low-quality image recognition unit is configured to perform filtering processing, image enhancement processing, image edge sharpening processing, image texture analysis processing, image segmentation processing, geometric analysis processing, image matching processing and morphological processing on the image to be recognized; wherein the key region extraction module further comprises a POI proportion extraction module, a Hough transform corner detection extraction module, a text fuzzy and exact matching extraction module, a region edge characteristic extraction module and a region text characteristic extraction module; wherein the POI proportion extraction module is configured to extract a key region according to the proportion of the POI; the Hough transform corner detection extraction module is configured to extract all frames by Hough transform and corner detection; the text fuzzy and exact matching extraction module is configured to extract a key region according to fuzzy matching and exact matching of the text; the region edge characteristic extraction module is configured to extract a key region according to the edge characteristics of the region; and the region text characteristic extraction module is configured to extract a key region according to the characteristics of the text in the region; wherein the edge characteristics used by the region edge characteristic extraction module are shape, symmetry, angle and edge granularity, and the text characteristics used by the region text characteristic extraction module are font, size and text type; and wherein the mobile device low-quality image recognition unit is further configured such that: the filtering processing performs image smoothing and image noise reduction; the image texture analysis processing performs skeletonization and connectivity processing; the image matching processing performs template matching and search matching; and the morphological processing performs dilation, erosion, and opening and closing operations.