CN115035541A - Large-size complex pdf engineering drawing text detection and identification method - Google Patents

Large-size complex pdf engineering drawing text detection and identification method

Publication number
CN115035541A
Authority
CN
China
Prior art keywords
text
image
sub
pdf
resolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210735421.3A
Other languages
Chinese (zh)
Inventor
姚昊
潘炼
伍吉泽
李武平
沈祯杰
刘忠良
李清
熊伟
张永兴
李强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CNNC Nuclear Power Operation Management Co Ltd
Original Assignee
CNNC Nuclear Power Operation Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-06-27
Filing date
2022-06-27
Publication date
2022-09-09
Application filed by CNNC Nuclear Power Operation Management Co Ltd
Priority to CN202210735421.3A
Publication of CN115035541A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/42 Document-oriented image-based pattern recognition based on the type of document
    • G06V30/422 Technical drawings; Geographical maps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/1444 Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)

Abstract

The invention provides a text detection and identification method for large-size complex PDF engineering drawings, comprising the following steps. Step S1: preprocess the PDF engineering drawing to generate a corresponding high-resolution image. Step S2: cut the high-resolution image into a plurality of low-resolution sub-images and record the order of the sub-images according to their positions. Step S3: perform a first text detection on each sub-image, preliminarily locate the text regions in the sub-image, and output the position coordinates of those regions. Step S4: map the position coordinates of the text regions in the sub-images to the original large image, remove duplicate data in the large image, and obtain the corresponding text region images from the deduplicated position coordinates. Step S5: perform a second text detection, precisely locate the text within each text region, and cut out the corresponding text block. Step S6: perform text recognition on each text block and extract its text content and corresponding coordinate position. The method improves text recognition accuracy for complex drawings.

Description

Large-size complex pdf engineering drawing text detection and identification method
Technical Field
The invention relates to the technical field of drawing text management for nuclear power plants, and in particular to a text detection and identification method for large-size complex PDF engineering drawings.
Background
In engineering, it is often necessary to link a drawing to its text content, so that information such as material codes and component numbers can be looked up quickly together with the drawing in which it appears. In the past this work was mostly done manually, which is inefficient, and the labor cost becomes very high when the text of a large number of drawings must be processed. A method that automatically recognizes the text content of drawings is therefore needed to replace manual work, so that text can be extracted from large numbers of PDF drawings at lower labor cost and with higher efficiency.
Currently, recognizing text in drawings generally takes two steps: text detection and text recognition. Text detection finds the text regions in a drawing, locates the text in the image, and outputs the position coordinates of each text region; text recognition outputs the text corresponding to each text region in the drawing.
Conventional drawing text extraction methods therefore suffer from high cost and low efficiency, and have difficulty with complex drawing content.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a low-cost, high-efficiency method for detecting and identifying text in large-size complex PDF engineering drawings.
To achieve this purpose, the invention provides the following technical solution:
a method for detecting and identifying text in a large-size complex PDF engineering drawing, comprising the following steps:
step S1: preprocessing the PDF engineering drawing to generate a corresponding high-resolution image;
step S2: cutting the high-resolution image into a plurality of low-resolution sub-images, and recording the order of the sub-images according to their positions;
step S3: performing a first text detection on each sub-image, preliminarily locating the text regions in the sub-image, and outputting the position coordinates of those regions;
step S4: mapping the position coordinates of the text regions in the sub-images to the original large image, removing duplicate data in the large image, and obtaining the corresponding text region images from the deduplicated position coordinates;
step S5: performing a second text detection, precisely locating the text within each text region, and cutting out the corresponding text block;
step S6: performing text recognition on each text block, and extracting its text content and corresponding coordinate position.
In step S2, the high-resolution image is cut into several low-resolution sub-images by sliding-window cropping.
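The sliding-window cropping of step S2 can be sketched as follows, using the example figures given in the detailed description (a 3680x2944 image, a 736x736 window, and a step of 368 in both directions). This is only an illustration: the function name `sliding_windows` is hypothetical, and the sketch assumes the image size minus the window size is an exact multiple of the step, as it is in that example.

```python
from typing import Iterator, Tuple


def sliding_windows(width: int, height: int, win: int, step: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (i, j, left, top) for each window position.

    i counts horizontal slides and j counts vertical slides, so the window
    covers pixels [left, left + win) x [top, top + win).
    Assumption: (width - win) and (height - win) are multiples of step;
    other sizes would need padding or a final clamped window.
    """
    n_cols = (width - win) // step + 1
    n_rows = (height - win) // step + 1
    for j in range(n_rows):
        for i in range(n_cols):
            yield i, j, i * step, j * step


tiles = list(sliding_windows(3680, 2944, win=736, step=368))
print(len(tiles))  # 63, i.e. 9 horizontal positions x 7 vertical positions
print(tiles[-1])   # (8, 6, 2944, 2208)
```

The pair (i, j) serves as the sequence number of each sub-image, which is what later allows detections to be mapped to the full image.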
In step S3, text detection on each sub-image is performed with the AdvancedEAST method, giving preliminary, rough position information for the text regions in the sub-image.
Step S4 includes:
step S41: mapping the coordinate positions from step S3 to the original high-resolution large image;
step S42: removing duplicate data from the coordinate information;
step S43: cutting out the corresponding text region images according to the deduplicated position coordinates.
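Step S41 amounts to adding each sub-image's offset in the large image to the coordinates detected inside it; the detailed description gives this as X = i*Δx + x and Y = j*Δy + y. A minimal sketch follows, assuming the 368-pixel steps of the embodiment. The names `to_full_image` and `bounding_box` are hypothetical, and the axis-aligned bounding box is just one plausible way to prepare the crop of step S43.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_full_image(quad: List[Point], i: int, j: int, dx: int = 368, dy: int = 368) -> List[Point]:
    """Shift the four vertices found in sub-image (i, j) by its offset (i*dx, j*dy)."""
    return [(i * dx + x, j * dy + y) for x, y in quad]


def bounding_box(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned (left, top, right, bottom) box enclosing the four vertices."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return min(xs), min(ys), max(xs), max(ys)


# A region detected in the sub-image reached by 2 horizontal slides and 1 vertical slide.
local_quad = [(10, 20), (110, 20), (110, 50), (10, 50)]
full_quad = to_full_image(local_quad, i=2, j=1)
print(full_quad)                # [(746, 388), (846, 388), (846, 418), (746, 418)]
print(bounding_box(full_quad))  # (746, 388, 846, 418)
```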
In step S5, a second text detection is performed on the text region images obtained in step S4, the text is precisely located, and the corresponding text images are cut out.
In step S6, the PaddleOCR text recognition scheme is used to recognize the text in the text images obtained in step S5, and the text content and the corresponding image region coordinates are output.
Compared with the prior art, the text detection and identification method for large-size complex PDF engineering drawings provided by the invention has the following beneficial effects:
the method can accurately detect the valid text regions in a large-size complex PDF engineering drawing, including the coordinate information of both horizontal and vertical text regions, and accurately recognize the text content within them.
In addition, two consecutive rounds of text detection effectively prevent lines, patterns and similar interference from degrading recognition, improving text recognition accuracy for complex drawings.
Furthermore, sliding-window tiling allows the text detection and identification method to be applied to large-size drawings while avoiding the risk of cutting through continuous text.
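Why overlapping windows avoid cutting through text can be checked numerically. With a window of 736 pixels and a step of 368 pixels (the values used in the detailed description), any span up to 736 - 368 = 368 pixels long lies entirely inside at least one window. The sketch below is a one-dimensional illustration only; the function name `containing_window` is hypothetical.

```python
from typing import Optional


def containing_window(start: int, length: int, total: int, win: int, step: int) -> Optional[int]:
    """Return the left edge of a window that fully contains [start, start + length), or None."""
    for left in range(0, total - win + 1, step):
        if left <= start and start + length <= left + win:
            return left
    return None


# With 50% overlap, every 368-pixel span along the 3680-pixel width fits whole in some window.
all_covered = all(
    containing_window(a, 368, 3680, 736, 368) is not None
    for a in range(0, 3680 - 368 + 1)
)
print(all_covered)  # True

# Without overlap (step equal to the window size), a span straddling a tile border is cut.
print(containing_window(700, 100, 3680, 736, 736))  # None
```

Text regions longer than the overlap can still be split across windows, which is one reason the detections have to be merged after mapping them to the large image.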
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a text detection and identification method for a large-size complex pdf engineering drawing according to an embodiment of the present invention.
Detailed Description
The following is a detailed description of the preferred embodiments.
The invention provides a text detection and identification method for large-size complex PDF engineering drawings, which can be broadly divided into four parts. First, PDF drawing processing: the PDF drawing is converted into a high-resolution image, which is split in order into sub-images of fixed size. Second, two rounds of text detection to locate text regions precisely: the first round is run on each sub-image to find the rough extent of the regions that contain text; the second round is run on the regions found in the first round, eliminating the interference present in them and precisely locating the text. Third, text coordinate processing: coordinates in the sub-images are mapped to the high-resolution large image, and duplicate coordinate data are filtered out. Fourth, text recognition: the text content of each region is recognized according to the text detection result, and the text content and its corresponding coordinate position are output.
As shown in Fig. 1, the method for detecting and identifying text in a large-size complex PDF engineering drawing provided by the invention comprises the following steps:
step S1: preprocessing the PDF engineering drawing to generate a corresponding high-resolution image, for example a 3680x2944 image of roughly ten million pixels;
step S2: cutting the high-resolution image into a plurality of smaller sub-images by sliding-window cropping, and recording the sequence number of each sub-image according to the numbers of horizontal and vertical slides, i and j, of the cropping window. Specifically, for the 3680x2944 large image, each sub-image $I_{i,j}$ has a width w and a height h of 736, the horizontal step Δx and the vertical step Δy are both 368, and 63 sub-images are obtained in total (9 horizontal positions by 7 vertical positions);
step S3: performing text detection on each sub-image with the AdvancedEAST method to obtain preliminary, rough position information for the text regions in the sub-image, expressed as the four vertices of a rectangular text region, i.e. 8 coordinate values $(x_0, y_0), \ldots, (x_3, y_3)$;
step S4: mapping the position coordinates of the text regions in the sub-images to the original large image, removing duplicate data in the large image, and obtaining the corresponding text region images from the deduplicated position coordinates;
step S41: mapping the coordinate positions from step S3 to the original high-resolution large image, using the following coordinate mapping formulas:
$X_m = i \cdot \Delta x + x_m, \quad m = 0, 1, 2, 3;$
$Y_n = j \cdot \Delta y + y_n, \quad n = 0, 1, 2, 3;$
step S42: removing duplicate data from the coordinate information. Because the sub-images are obtained by sliding-window cropping in step S2, the same text region may be detected several times, producing several sets of coordinates that point to the same region of the original image; these duplicates need to be merged into a single set of coordinates. The criterion for merging duplicate data is:
[Formula shown only as an image in the original publication: Figure BDA0003715147000000051]
where $S_i$ denotes a text region. If one detected text region contains another, their coordinates are merged and the coordinates of the smaller region are discarded.
step S43: cutting out the corresponding text region images according to the deduplicated position coordinates.
step S5: performing a second text detection on the text region images obtained from the first text detection, precisely locating the text, and cutting out the corresponding text images. The second detection effectively removes the interference of lines or patterns other than text in the regions found by the first detection, giving more accurate text localization and ensuring the accuracy of the subsequent recognition.
step S6: performing text recognition on the precise text regions obtained by text detection, using the PaddleOCR text recognition scheme, and finally outputting the text content and the corresponding image region coordinates.
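The containment-based deduplication described in step S42 can be illustrated as follows. The exact merging formula is not reproduced in the text, so this sketch rests on assumptions: regions are treated as axis-aligned boxes in full-image coordinates, and a box is dropped when a box that is already kept fully contains it. The names `contains`, `area` and `drop_contained` are hypothetical, and partially overlapping boxes are deliberately left untouched.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in full-image pixels


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely within outer (identical boxes count as contained)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def area(box: Box) -> int:
    return (box[2] - box[0]) * (box[3] - box[1])


def drop_contained(boxes: List[Box]) -> List[Box]:
    """Keep the larger box of every containing pair and discard the smaller one."""
    kept: List[Box] = []
    # Visiting boxes from largest to smallest means any containing box is seen first.
    for box in sorted(boxes, key=area, reverse=True):
        if not any(contains(k, box) for k in kept):
            kept.append(box)
    return kept


detections = [
    (746, 388, 846, 418),  # found in one sub-image
    (746, 388, 846, 418),  # same region found again in an overlapping sub-image
    (760, 390, 840, 415),  # smaller box inside the first one
    (100, 100, 200, 130),  # unrelated region
]
unique = drop_contained(detections)
print(len(unique))  # 2
```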
The text detection part adopts the open-source AdvancedEAST text detection scheme, which uses a VGG16 network as the backbone to extract pixel features from the drawing, fuses multi-channel features through upsampling, convolution and similar operations, and predicts text regions from the fused features. The text recognition part uses the open-source PaddleOCR text recognition scheme, which is based on the CRNN model with CTC loss as the loss function.
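A recognizer trained with CTC loss emits one class distribution per horizontal frame, and the text is read out by collapsing that sequence. The sketch below shows the standard greedy CTC decoding rule (take the best class per frame, merge consecutive repeats, drop blanks). It is not taken from PaddleOCR's code: the function name, the toy alphabet, and the choice of class 0 as the blank are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class 0 is the CTC blank, class k maps to alphabet[k - 1]


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: best class per frame, merge repeats, remove blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    chars: List[str] = []
    prev = None
    for idx in best:
        # A class is emitted only when it differs from the previous frame and is not blank.
        if idx != prev and idx != BLANK:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


# Toy per-frame outputs whose best classes are 1, 1, 0, 2, 2, 0, 2, 3.
best_classes = [1, 1, 0, 2, 2, 0, 2, 3]
frames = [[1.0 if c == k else 0.0 for c in range(4)] for k in best_classes]
print(ctc_greedy_decode(frames, alphabet="A12"))  # A112
```

The blank between the two runs of class 2 is what lets the decoder output a doubled character, which matters for codes such as equipment or material numbers.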
The invention provides an application-oriented foundational technology. It solves the problem of detecting and recognizing text in PDF engineering drawings that are large in size (the whole PDF drawing cannot be used directly as the input) and complex in content (horizontal text, vertical text, and interfering lines or patterns that resemble text are all present). It can provide technical support for applications involving specific text in large-size complex PDF engineering drawings, such as recognition of equipment codes or material codes, code error-correction recommendation, code location query, and code-to-file association.
The above describes only specific embodiments of the invention, but the scope of protection of the invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention falls within its scope of protection. The scope of protection of the invention is therefore defined by the appended claims.

Claims (6)

1. A text detection and identification method for a large-size complex PDF engineering drawing, characterized by comprising the following steps:
step S1: preprocessing the PDF engineering drawing to generate a corresponding high-resolution image;
step S2: cutting the high-resolution image into a plurality of low-resolution sub-images, and recording the order of the sub-images according to their positions;
step S3: performing a first text detection on each sub-image, preliminarily locating the text regions in the sub-image, and outputting the position coordinates of those regions;
step S4: mapping the position coordinates of the text regions in the sub-images to the original large image, removing duplicate data in the large image, and obtaining the corresponding text region images from the deduplicated position coordinates;
step S5: performing a second text detection, precisely locating the text within each text region, and cutting out the corresponding text block;
step S6: performing text recognition on each text block, and extracting its text content and corresponding coordinate position.
2. The text detection and identification method for a large-size complex PDF engineering drawing according to claim 1, wherein in step S2 the high-resolution image is cut into several low-resolution sub-images by sliding-window cropping.
3. The text detection and identification method for a large-size complex PDF engineering drawing according to claim 1, wherein in step S3 text detection on each sub-image is performed with the AdvancedEAST method, giving preliminary, rough position information for the text regions in the sub-image.
4. The text detection and identification method for a large-size complex PDF engineering drawing according to claim 1, wherein step S4 comprises:
step S41: mapping the coordinate positions from step S3 to the original high-resolution large image;
step S42: removing duplicate data from the coordinate information;
step S43: cutting out the corresponding text region images according to the deduplicated position coordinates.
5. The text detection and identification method for a large-size complex PDF engineering drawing according to claim 1, wherein in step S5 a second text detection is performed on the text region images obtained in step S4, the text is precisely located, and the corresponding text images are cut out.
6. The text detection and identification method for a large-size complex PDF engineering drawing according to claim 1, wherein in step S6 the PaddleOCR text recognition scheme is used to recognize the text in the text images obtained in step S5, and the text content and the corresponding image region coordinates are output.
CN202210735421.3A 2022-06-27 2022-06-27 Large-size complex pdf engineering drawing text detection and identification method Pending CN115035541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210735421.3A CN115035541A (en) 2022-06-27 2022-06-27 Large-size complex pdf engineering drawing text detection and identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210735421.3A CN115035541A (en) 2022-06-27 2022-06-27 Large-size complex pdf engineering drawing text detection and identification method

Publications (1)

Publication Number Publication Date
CN115035541A true CN115035541A (en) 2022-09-09

Family

ID=83126782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210735421.3A Pending CN115035541A (en) 2022-06-27 2022-06-27 Large-size complex pdf engineering drawing text detection and identification method

Country Status (1)

Country Link
CN (1) CN115035541A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118570066A (en) * 2024-08-02 2024-08-30 中国科学技术大学 Text image super-resolution method and system based on mask reconstruction paradigm

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190266394A1 (en) * 2018-02-26 2019-08-29 Abc Fintech Co., Ltd. Method and device for parsing table in document image
CN111414906A (en) * 2020-03-05 2020-07-14 北京交通大学 Data synthesis and text recognition method for paper bill picture
CN111860348A (en) * 2020-07-21 2020-10-30 国网山东省电力公司青岛供电公司 Deep learning-based weak supervision power drawing OCR recognition method
CN112069985A (en) * 2020-08-31 2020-12-11 华中农业大学 High-resolution field image rice ear detection and counting method based on deep learning
CN112633277A (en) * 2020-12-30 2021-04-09 杭州电子科技大学 Channel ship board detection, positioning and identification method based on deep learning
CN113269049A (en) * 2021-04-30 2021-08-17 天津科技大学 Method for detecting handwritten Chinese character area
WO2021190171A1 (en) * 2020-03-25 2021-09-30 腾讯科技(深圳)有限公司 Image recognition method and apparatus, terminal, and storage medium
CN113569629A (en) * 2021-06-11 2021-10-29 杭州玖欣物联科技有限公司 Model method for extracting key information and desensitizing sensitive information of machining drawing
CN114140803A (en) * 2022-01-30 2022-03-04 杭州实在智能科技有限公司 Document single word coordinate detection and correction method and system based on deep learning
CN114170608A (en) * 2021-12-01 2022-03-11 上海东普信息科技有限公司 Super-resolution text image recognition method, device, equipment and storage medium
CN114220091A (en) * 2021-12-16 2022-03-22 广东电网有限责任公司 Image text detection method and system based on fast Rcnn

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李昊 等: "基于深度学习和图匹配的接线图检测与校核", 北京航空航天大学学报, vol. 47, no. 3, 2 November 2020 (2020-11-02) *

Similar Documents

Publication Publication Date Title
CN111814722B (en) Method and device for identifying table in image, electronic equipment and storage medium
CN110363102B (en) Object identification processing method and device for PDF (Portable document Format) file
US8401333B2 (en) Image processing method and apparatus for multi-resolution feature based image registration
RU2651144C2 (en) Data input from images of the documents with fixed structure
US9230382B2 (en) Document image capturing and processing
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
CN110781877B (en) Image recognition method, device and storage medium
Chen et al. Shadow-based Building Detection and Segmentation in High-resolution Remote Sensing Image.
CN111626145B (en) Simple and effective incomplete form identification and page-crossing splicing method
CN105139042A (en) Image identification method and system
CN112883926B (en) Identification method and device for form medical images
CN110321750A (en) Two-dimensional code identification method and system in a kind of picture
CN116052193B (en) RPA interface dynamic form picking and matching method and system
CN112016481A (en) Financial statement information detection and identification method based on OCR
CN116311259B (en) Information extraction method for PDF business document
CN113688688A (en) Completion method of table lines in picture and identification method of table in picture
CN115035541A (en) Large-size complex pdf engineering drawing text detection and identification method
US8897538B1 (en) Document image capturing and processing
CN109635729B (en) Form identification method and terminal
CN101425143A (en) Image positioning method and device
CN112364863B (en) Character positioning method and system for license document
CN112232390B (en) High-pixel large image identification method and system
CN117218672A (en) Deep learning-based medical records text recognition method and system
CN115861922B (en) Sparse smoke detection method and device, computer equipment and storage medium
CN101901333A (en) Method for segmenting word in text image and identification device using same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination