CN112528832A - Method and system for processing PDF-format relay protection fixed value list - Google Patents

Method and system for processing PDF-format relay protection fixed value list

Info

Publication number
CN112528832A
Authority
CN
China
Prior art keywords
relay protection
list
text
protection setting
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011418226.5A
Other languages
Chinese (zh)
Inventor
车克杉
刘可
杨�嘉
赵金朝
保积秀
王宁霞
王少飞
杨文丽
陈卉
丛贵斌
罗敏
闫涵
张真
张婧
王学斌
傅国斌
甘嘉田
丁玉杰
张�杰
宋锐
赵世昌
王轩
马勇飞
杨军
卢国强
肖明
赵东宁
杨凯璇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Qinghai Electric Power Co Ltd
Original Assignee
Electric Power Research Institute of State Grid Qinghai Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-07
Filing date
2020-12-07
Publication date
2021-03-19
Application filed by Electric Power Research Institute of State Grid Qinghai Electric Power Co Ltd
Priority to CN202011418226.5A
Publication of CN112528832A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/412Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)

Abstract

The application belongs to the field of image data processing for power systems and provides a method and a system for processing a relay protection setting list in PDF format. The method comprises: performing OCR (optical character recognition) on a relay protection setting list in PDF format whose object types include an image, and recognizing the tables in the setting list and the text content in the tables; parsing the recognized tables based on the definitions in a relay protection setting list table model library, finding the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtaining the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables. The application can automatically recognize a relay protection setting list in PDF format that contains images, meets the application requirements related to relay protection setting lists, and extracts the key information of the setting list efficiently and accurately.

Description

Method and system for processing PDF-format relay protection fixed value list
Technical Field
The application belongs to the field of image data processing of a power system, and particularly relates to a method and a system for processing a PDF-format relay protection fixed value list.
Background
The Portable Document Format (PDF) is cross-platform, preserves the original layout of a file, is widely compatible across operating systems, and ensures that the data in a file is not altered by character encoding differences when the file is transferred between systems. For these reasons, relay protection setting lists are mostly transmitted as PDF files. A PDF file is built from objects, whose types include text, images, music, video, fonts, hyperlinks, encryption information and so on. Data is difficult to read from a PDF file whose objects are images. When a software program needs to use a relay protection setting list in PDF format whose object types include an image, the common approach at present is to compare the files manually and then parse and enter the content of the setting list by hand, which is inefficient and error-prone.
Disclosure of Invention
The application aims to provide a method, a system, a computer-readable storage medium and a computer device for processing a relay protection setting list in PDF (Portable Document Format) format, so as to solve the low efficiency and high error rate that result when a software program needs to use a PDF relay protection setting list whose object types include an image and the content of the setting list has to be parsed and entered by manually comparing files.
In a first aspect, the present application provides a method for processing a relay protection setting list in PDF format, where the method includes:
S101, acquiring a relay protection setting list in PDF format whose object types include an image;
S102, performing OCR on the relay protection setting list in PDF format whose object types include an image, and recognizing the tables in the setting list and the text content in the tables;
S103, parsing the recognized tables based on the definitions in the relay protection setting list table model library, finding the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtaining the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables;
S104, structuring the table key information according to the different relay protection setting lists, persisting it to a database, and providing a data access interface.
In a second aspect, the present application provides a system for processing a relay protection setting list in PDF format, where the system includes:
an acquisition module, configured to acquire a relay protection setting list in PDF format whose object types include an image;
a recognition module, configured to perform OCR (optical character recognition) on the relay protection setting list in PDF format whose object types include an image, and to recognize the tables in the setting list and the text content in the tables;
a parsing module, configured to parse the recognized tables based on the definitions in the relay protection setting list table model library, find the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtain the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables;
a storage module, configured to structure the table key information according to the different relay protection setting lists, persist it to a database, and provide a data access interface.
In a third aspect, the present application provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of the method for processing the relay protection setting list in the PDF format.
In a fourth aspect, the present application provides a computer device comprising:
one or more processors;
a memory; and
one or more computer programs, the processor and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the processor, when executing the computer programs, implements the steps of the method for processing a relay protection setting list in PDF format as described above.
In the application, OCR is performed on a relay protection setting list in PDF format whose object types include an image, and the tables in the setting list and the text content in the tables are recognized; the recognized tables are parsed based on the definitions in the relay protection setting list table model library, the corresponding table definition is found using preset header-row keywords of the relay protection setting list, and the table key information of the setting list is obtained from the text content in the tables combined with the text and coordinates outside the tables. Therefore, a relay protection setting list in PDF format that contains images can be recognized automatically, the application requirements related to relay protection setting lists are met, and the key information of the setting list can be extracted efficiently and accurately.
Drawings
Fig. 1 is a flowchart of a method for processing a relay protection fixed value list in a PDF format according to an embodiment of the present application.
Fig. 2 is a functional block diagram of a system for processing a relay protection fixed value list in a PDF format according to an embodiment of the present application.
Fig. 3 is a block diagram illustrating a specific structure of a computer device according to an embodiment of the present disclosure.
Detailed Description
In order to make the purpose, technical solution and beneficial effects of the present application more clear and more obvious, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Referring to Fig. 1, which is a flowchart of a method for processing a relay protection setting list in PDF format according to an embodiment of the present application, this embodiment is described mainly by taking the application of the method to a computer device as an example. The method includes the following steps:
S101, acquiring a relay protection setting list in PDF format whose object types include an image.
S102, performing OCR (Optical Character Recognition) on the relay protection setting list in PDF format whose object types include an image, and recognizing the tables in the setting list and the text content in the tables.
In an embodiment of the present application, S101 may specifically be:
acquiring a relay protection setting list in PDF format whose object types include an image, and scanning the setting list to obtain an image file.
S102 may specifically include the following steps:
S1021, recognizing the scanned image file through OCR, and obtaining the table lines of each page image through binarization;
S1022, because a page image may contain multiple tables as well as out-of-table text, analyzing the table lines, treating table lines that are discontinuous in the vertical or horizontal direction as belonging to two independent tables, and splitting the image to obtain a plurality of table regions and out-of-table text regions;
S1023, for each table region, dividing the table into a plurality of cell images at the intersections of the table lines, and performing text localization and text recognition on the cell images; and performing text localization and text recognition on the out-of-table text regions.
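Steps S1021 to S1023 can be illustrated with a minimal pure-Python sketch. It is not the implementation of the application: the function names, the threshold of 128, the `min_len` parameter and the tiny synthetic page are all assumptions made for illustration. The sketch assumes one-pixel-wide lines, tables stacked vertically only, and regular grids without merged cells.

```python
from typing import List, Tuple

Run = Tuple[int, int, int]  # (line index, start, end); end is exclusive


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map a grayscale grid (0 = black, 255 = white) to 1 for ink, 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def long_runs(rows: List[List[int]], min_len: int) -> List[Run]:
    """Collect runs of consecutive ink pixels that are at least min_len long."""
    runs: List[Run] = []
    for i, row in enumerate(rows):
        start = None
        for j, ink in enumerate(list(row) + [0]):  # trailing 0 closes a run at the edge
            if ink and start is None:
                start = j
            elif not ink and start is not None:
                if j - start >= min_len:
                    runs.append((i, start, j))
                start = None
    return runs


def table_lines(binary: List[List[int]], min_len: int) -> Tuple[List[Run], List[Run]]:
    """Return (horizontal, vertical) runs; vertical runs come from the transposed grid."""
    columns = [list(col) for col in zip(*binary)]
    return long_runs(binary, min_len), long_runs(columns, min_len)


def split_tables(horizontal: List[Run], vertical: List[Run]) -> List[List[int]]:
    """Group horizontal line rows into tables: two consecutive lines share a table
    only if some vertical line spans both (otherwise the lines are discontinuous)."""
    groups: List[List[int]] = []
    for y in sorted({y for y, _, _ in horizontal}):
        if groups and any(s <= groups[-1][-1] and y < e for _, s, e in vertical):
            groups[-1].append(y)
        else:
            groups.append([y])
    return groups


def cell_boxes(ys: List[int], vertical: List[Run]) -> List[Tuple[int, int, int, int]]:
    """Cut one table into (left, top, right, bottom) boxes at the line intersections."""
    xs = sorted({x for x, s, e in vertical if s <= ys[0] and ys[-1] < e})
    return [(xs[c], ys[r], xs[c + 1], ys[r + 1])
            for r in range(len(ys) - 1) for c in range(len(xs) - 1)]


def demo_page() -> List[List[int]]:
    """Synthetic 9x14 page: a 2x2 table on top, a gap, then a one-cell table."""
    page = [[255] * 9 for _ in range(14)]
    for y in (0, 3, 6, 9, 13):
        page[y] = [0] * 9
    for y in range(0, 7):
        for x in (0, 4, 8):
            page[y][x] = 0
    for y in range(9, 14):
        for x in (0, 8):
            page[y][x] = 0
    return page


horizontal, vertical = table_lines(binarize(demo_page()), min_len=5)
tables = split_tables(horizontal, vertical)            # [[0, 3, 6], [9, 13]]
boxes = [cell_boxes(ys, vertical) for ys in tables]    # 4 cells, then 1 cell
```

Each box would then be cropped from the page image and passed to text localization and recognition. A practical system would more likely use an image library with morphological operations to tolerate thick, skewed or broken lines.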
In an embodiment of the application, a CTPN + DenseNet + CTC neural network may be used for text localization and text recognition: the convolutional neural network (CNN) VGG16 performs feature extraction, the recurrent neural network (RNN) BLSTM produces a probability array over all output characters, and DenseNet + CTC is used to recognize the text.
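How a CTC output layer turns a per-time-step probability array into text can be sketched as follows. The application does not state which decoding strategy is used; greedy (best-path) decoding is assumed here because it is the simplest one, and the label set and probabilities are made up for illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(probs: Sequence[Sequence[float]],
                      labels: Sequence[str],
                      blank: int = 0) -> str:
    """Best-path CTC decoding: take the most likely label at each time step,
    merge consecutive repeats, then drop the blank label."""
    decoded: List[str] = []
    previous = None
    for step in probs:
        best = max(range(len(step)), key=lambda k: step[k])
        if best != blank and best != previous:
            decoded.append(labels[best])
        previous = best
    return "".join(decoded)


# Hypothetical 4-class output: index 0 is the CTC blank.
labels = ["<blank>", "1", "0", "A"]
probs = [
    [0.10, 0.70, 0.10, 0.10],  # "1"
    [0.10, 0.60, 0.20, 0.10],  # "1" repeated, merged with the previous step
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "0"
    [0.90, 0.00, 0.10, 0.00],  # blank separates two genuine "0" characters
    [0.10, 0.10, 0.70, 0.10],  # "0"
    [0.10, 0.10, 0.10, 0.70],  # "A"
]
text = ctc_greedy_decode(probs, labels)  # "100A"
```

The blank label is what allows a genuinely doubled character, such as the two zeros in a setting value, to survive the merging of repeats.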
In an embodiment of the present application, after S1023, the method may further include the following step:
judging, according to the table characteristics of the relay protection setting list and by considering the table attributes together, whether a table spans pages, and merging the tables that span pages.
In an embodiment of the application, judging whether a table spans pages according to the table characteristics of the relay protection setting list and the table attributes may specifically include:
first filtering quickly on obvious features to exclude tables that cannot span pages, and then judging whether a table spans pages according to attributes of the two adjacent tables, such as the number of columns, the widths, the header content and the serial number column.
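The cross-page judgment can be sketched as a small heuristic. The application names the attributes compared (number of columns, widths, header content, serial number column) but not the exact rules, so the `TableInfo` fields, the 5% width tolerance and the "table touches the page edge" quick filter below are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableInfo:
    page: int
    column_widths: List[float]
    header: List[str]              # empty if the table has no header row
    first_serial: Optional[int]    # first value in the serial number column
    last_serial: Optional[int]     # last value in the serial number column
    at_page_top: bool              # assumed quick-filter feature
    at_page_bottom: bool           # assumed quick-filter feature


def is_continuation(prev: TableInfo, nxt: TableInfo, width_tol: float = 0.05) -> bool:
    """Return True if `nxt` looks like the continuation of `prev` on the next page."""
    # Quick filters: exclude pairs that cannot form one page-spanning table.
    if nxt.page != prev.page + 1:
        return False
    if not (prev.at_page_bottom and nxt.at_page_top):
        return False
    # Column count and column widths must agree.
    if len(prev.column_widths) != len(nxt.column_widths):
        return False
    for a, b in zip(prev.column_widths, nxt.column_widths):
        if abs(a - b) > width_tol * max(a, b):
            return False
    # A continuation either repeats the header or has none.
    if nxt.header and nxt.header != prev.header:
        return False
    # Serial numbers, when both are known, must continue without a gap.
    if prev.last_serial is not None and nxt.first_serial is not None:
        return nxt.first_serial == prev.last_serial + 1
    return True


first = TableInfo(1, [40, 200, 80], ["No.", "Setting item", "Value"], 1, 12, False, True)
second = TableInfo(2, [40, 201, 80], [], 13, 20, True, False)
other = TableInfo(2, [40, 200], ["No.", "Remark"], 1, 3, True, False)

merge_second = is_continuation(first, second)  # True
merge_other = is_continuation(first, other)    # False
```

When the check succeeds, the rows of the second table would be appended to the first before the table is parsed.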
S103, parsing the recognized tables based on the definitions in the relay protection setting list table model library, finding the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtaining the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables.
In an embodiment of the present application, the table key information includes the group to which a setting item belongs, the setting item name, and the setting item value.
In an embodiment of the present application, S103 may specifically be:
matching each recognized table against the relay protection setting list table model library according to the cell sizes and coordinates in the table, finding the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtaining the table key information of the setting list from the text content in the table combined with the text and coordinates outside the table.
In an embodiment of the application, a table style in the relay protection setting list table model library includes information such as the number of columns of the table, the semantics of each column, the reading order, and the semantics of merged cells.
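The lookup in the table model library can be sketched as follows. The `TableDefinition` structure, the two library entries and the sample rows are hypothetical. Matching by cell size and coordinates and the handling of merged cells are omitted, and the group name, which the method derives from text outside the table, is simply passed in as a parameter.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TableDefinition:
    name: str
    header_keywords: List[str]    # one keyword per column, in reading order
    column_semantics: List[str]   # meaning of each column, e.g. "name" or "value"


# Hypothetical model library entries, for illustration only.
MODEL_LIBRARY = [
    TableDefinition("device_parameters",
                    ["No.", "Parameter", "Value"],
                    ["index", "name", "value"]),
    TableDefinition("protection_settings",
                    ["No.", "Setting item", "Setting value", "Unit"],
                    ["index", "name", "value", "unit"]),
]


def find_definition(header_row: List[str],
                    library: List[TableDefinition]) -> Optional[TableDefinition]:
    """Return the first definition whose column count and header keywords match."""
    for definition in library:
        if len(definition.header_keywords) != len(header_row):
            continue
        if all(keyword in cell
               for keyword, cell in zip(definition.header_keywords, header_row)):
            return definition
    return None


def extract_key_info(rows: List[List[str]], group: str,
                     library: List[TableDefinition]) -> List[Dict[str, str]]:
    """Turn recognized rows into (group, name, value) records using the matched definition."""
    if not rows:
        return []
    definition = find_definition(rows[0], library)
    if definition is None:
        return []
    name_col = definition.column_semantics.index("name")
    value_col = definition.column_semantics.index("value")
    return [{"group": group, "name": row[name_col], "value": row[value_col]}
            for row in rows[1:]]


rows = [
    ["No.", "Setting item name", "Setting value", "Unit"],
    ["1", "Overcurrent stage I current", "5.0", "A"],
    ["2", "Overcurrent stage I delay", "0.3", "s"],
]
items = extract_key_info(rows, "Overcurrent protection", MODEL_LIBRARY)
```

Keeping the column semantics in the library rather than in code means a new setting list layout can be supported by adding a definition instead of changing the parser.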
S104, structuring the table key information according to the different relay protection setting lists, persisting it to a database, and providing a data access interface.
In an embodiment of the present application, after S104, the method may further include the following step:
outputting the content of the relay protection setting list in JSON data format by calling the data access interface.
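Persisting the key information and returning it as JSON can be sketched with the Python standard library. The application does not specify a database engine or schema; SQLite, the table name `setting_item`, its columns and the identifier `sheet-001` are assumptions chosen to keep the example self-contained.

```python
import json
import sqlite3
from typing import Dict, List


def store_items(conn: sqlite3.Connection, sheet_id: str,
                items: List[Dict[str, str]]) -> None:
    """Persist (group, name, value) records of one setting list."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS setting_item "
        "(sheet_id TEXT, grp TEXT, name TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO setting_item (sheet_id, grp, name, value) VALUES (?, ?, ?, ?)",
        [(sheet_id, item["group"], item["name"], item["value"]) for item in items],
    )
    conn.commit()


def sheet_as_json(conn: sqlite3.Connection, sheet_id: str) -> str:
    """Data access interface: return the stored content of one setting list as JSON."""
    rows = conn.execute(
        "SELECT grp, name, value FROM setting_item WHERE sheet_id = ? ORDER BY rowid",
        (sheet_id,),
    ).fetchall()
    payload = {
        "sheet_id": sheet_id,
        "items": [{"group": g, "name": n, "value": v} for g, n, v in rows],
    }
    # ensure_ascii=False keeps Chinese setting item names readable in the output.
    return json.dumps(payload, ensure_ascii=False)


conn = sqlite3.connect(":memory:")
store_items(conn, "sheet-001", [
    {"group": "Overcurrent protection", "name": "Stage I current", "value": "5.0"},
])
result = sheet_as_json(conn, "sheet-001")
```

Values are stored as text here because setting values on a sheet may be numbers, control words or switch states, and the consuming application can convert them as needed.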
Referring to Fig. 2, the system for processing a relay protection setting list in PDF format according to an embodiment of the present application may be a computer program or program code running on a computer device; for example, the system may be application software. The system can be used to execute the corresponding steps of the method for processing a relay protection setting list in PDF format provided by the embodiments of the application. The system provided by an embodiment of the application includes:
an acquisition module 11, configured to acquire a relay protection setting list in PDF format whose object types include an image;
a recognition module 12, configured to perform OCR (optical character recognition) on the relay protection setting list in PDF format whose object types include an image, and to recognize the tables in the setting list and the text content in the tables;
a parsing module 13, configured to parse the recognized tables based on the definitions in the relay protection setting list table model library, find the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtain the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables;
a storage module 14, configured to structure the table key information according to the different relay protection setting lists, persist it to a database, and provide a data access interface.
The system for processing the relay protection setting list in the PDF format provided in an embodiment of the present application and the method for processing the relay protection setting list in the PDF format provided in an embodiment of the present application belong to the same concept, and a specific implementation process thereof is detailed throughout the entire specification and is not described herein again.
An embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for processing the relay protection setting list in the PDF format, as provided in an embodiment of the present application.
Fig. 3 shows a specific block diagram of a computer device provided in an embodiment of the present application. The computer device 100 includes: one or more processors 101, a memory 102, and one or more computer programs, wherein the processors 101 and the memory 102 are connected by a bus, the one or more computer programs are stored in the memory 102 and configured to be executed by the one or more processors 101, and the processor 101, when executing the computer programs, implements the steps of the method for processing a relay protection setting list in PDF format as provided in an embodiment of the present application. The computer device may be a server, a terminal or the like. It may be a desktop computer, a mobile terminal or a vehicle-mounted device, and the mobile terminal includes at least one of a mobile phone, a tablet computer, a personal digital assistant or a wearable device.
It should be understood that the steps in the embodiments of the present application are not necessarily performed in the order indicated by the step numbers. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to be within the scope of this specification.
The above examples show only some embodiments of the present invention. Their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the scope of protection of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for processing a relay protection setting list in PDF format, characterized by comprising the following steps:
S101, acquiring a relay protection setting list in PDF format whose object types include an image;
S102, performing OCR on the relay protection setting list in PDF format whose object types include an image, and recognizing the tables in the setting list and the text content in the tables;
S103, parsing the recognized tables based on the definitions in the relay protection setting list table model library, finding the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtaining the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables;
S104, structuring the table key information according to the different relay protection setting lists, persisting it to a database, and providing a data access interface.
2. The method according to claim 1, wherein S101 is specifically:
acquiring a relay protection setting list in PDF format whose object types include an image, and scanning the setting list to obtain an image file.
3. The method of claim 2, wherein S102 specifically comprises:
S1021, recognizing the scanned image file through OCR, and obtaining the table lines of each page image through binarization;
S1022, analyzing the table lines, treating table lines that are discontinuous in the vertical or horizontal direction as belonging to two independent tables, and splitting the image to obtain a plurality of table regions and out-of-table text regions;
S1023, for each table region, dividing the table into a plurality of cell images at the intersections of the table lines, and performing text localization and text recognition on the cell images; and performing text localization and text recognition on the out-of-table text regions.
4. The method of claim 3, wherein after S1023, the method further comprises:
judging, according to the table characteristics of the relay protection setting list and by considering the table attributes together, whether a table spans pages, and merging the tables that span pages.
5. The method of claim 4, wherein judging whether a table spans pages according to the table characteristics of the relay protection setting list and the table attributes specifically comprises:
first filtering quickly on obvious features to exclude tables that cannot span pages, and then judging whether a table spans pages according to the feature attributes of the two adjacent tables on the preceding and following pages.
6. The method of claim 1, wherein the table key information includes the group to which a setting item belongs, the setting item name, and the setting item value;
S103 specifically comprises:
matching each recognized table against the relay protection setting list table model library according to the cell sizes and coordinates in the table, finding the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtaining the table key information of the setting list from the text content in the table combined with the text and coordinates outside the table.
7. The method of claim 1, wherein after S104, the method further comprises:
outputting the content of the relay protection setting list in JSON data format by calling the data access interface.
8. A system for processing a relay protection setting list in PDF format, the system comprising:
an acquisition module, configured to acquire a relay protection setting list in PDF format whose object types include an image;
a recognition module, configured to perform OCR (optical character recognition) on the relay protection setting list in PDF format whose object types include an image, and to recognize the tables in the setting list and the text content in the tables;
a parsing module, configured to parse the recognized tables based on the definitions in the relay protection setting list table model library, find the corresponding table definition using preset header-row keywords of the relay protection setting list, and obtain the table key information of the setting list from the text content in the tables combined with the text and coordinates outside the tables;
a storage module, configured to structure the table key information according to the different relay protection setting lists, persist it to a database, and provide a data access interface.
9. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for processing a relay protection setting list in PDF format according to any one of claims 1 to 7.
10. A computer device, comprising:
one or more processors;
a memory; and
one or more computer programs, the processor and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, characterized in that the processor, when executing the computer programs, implements the steps of the method for processing a relay protection setting list in PDF format according to any one of claims 1 to 7.
CN202011418226.5A 2020-12-07 2020-12-07 Method and system for processing PDF-format relay protection fixed value list Pending CN112528832A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011418226.5A CN112528832A (en) 2020-12-07 2020-12-07 Method and system for processing PDF-format relay protection fixed value list

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011418226.5A CN112528832A (en) 2020-12-07 2020-12-07 Method and system for processing PDF-format relay protection fixed value list

Publications (1)

Publication Number Publication Date
CN112528832A (en) 2021-03-19

Family

ID=74997132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011418226.5A Pending CN112528832A (en) 2020-12-07 2020-12-07 Method and system for processing PDF-format relay protection fixed value list

Country Status (1)

Country Link
CN (1) CN112528832A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113608480A (en) * 2021-08-11 2021-11-05 中国南方电网有限责任公司超高压输电公司贵阳局 Fixed value checking method of extra-high voltage direct current protection device
CN114239881A (en) * 2021-12-13 2022-03-25 国网河南省电力公司漯河供电公司 Relay protection fixed value checking method and system based on CNN technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344831A (en) * 2018-08-22 2019-02-15 中国平安人寿保险股份有限公司 Data table recognition method, device and terminal device
CN110263739A (en) * 2019-06-26 2019-09-20 四川新网银行股份有限公司 Photo table recognition method based on OCR technology
CN110795919A (en) * 2019-11-07 2020-02-14 达而观信息科技(上海)有限公司 Method, device, equipment and medium for extracting table in PDF document
CN111027297A (en) * 2019-12-23 2020-04-17 海南港澳资讯产业股份有限公司 Method for processing key form information of image type PDF financial data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YANG CHENGUANG: "chinese_ocr: CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras", 《HTTPS://GITEE.COM/ROTHSWORD/CHINESE_OCR》 *
YCG09/CHINESE_OCR: "CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras", 《HTTPS://GITHUB.COM/SYSAU/CHINESE_OCR》 *
这辈子就是你: "chinese-ocr: variable-length text recognition in natural scenes (ctpn + densenet)", 《HTTPS://BLOG.CSDN.NET/WEIXIN_42861043/ARTICLE/DETAILS/89705021》 *

Similar Documents

Publication Publication Date Title
CN110795919B (en) Form extraction method, device, equipment and medium in PDF document
CN109446173B (en) Log data processing method, device, computer equipment and storage medium
WO2019075969A1 (en) Method for extracting form information in a structured manner, electronic device, and computer-readable storage medium
CN107689070B (en) Chart data structured extraction method, electronic device and computer-readable storage medium
CN111191079B (en) Document content acquisition method, device, equipment and storage medium
WO2019041527A1 (en) Method of extracting chart in document, electronic device and computer-readable storage medium
CN112528832A (en) Method and system for processing PDF-format relay protection fixed value list
CN110909123B (en) Data extraction method and device, terminal equipment and storage medium
CN114357174B (en) Code classification system and method based on OCR and machine learning
CN113033165A (en) Spreadsheet file parsing method and device and computer readable storage medium
CN110889341A (en) Form image recognition method and device based on AI (Artificial Intelligence), computer equipment and storage medium
CN117235546B (en) Multi-version file comparison method, device, system and storage medium
CN115994232B (en) Online multi-version document identity authentication method, system and computer equipment
CN111357015B (en) Text conversion method, apparatus, computer device, and computer-readable storage medium
CN110909733A (en) Template positioning method and device based on OCR picture recognition and computer equipment
CN112463791A (en) Nuclear power station document data acquisition method and device, computer equipment and storage medium
CN115565193A (en) Questionnaire information input method and device, electronic equipment and storage medium
CN114169331A (en) Address resolution method, device, computer equipment and storage medium
CN113868411A (en) Contract comparison method and device, storage medium and computer equipment
CN113850265A (en) PDF document analysis method and device, electronic equipment and storage medium
CN113901950A (en) High-accuracy table OCR recognition method and system
CN112257718A (en) Text recognition method and device for radiology department films
CN111209759A (en) Webpage translation method and device, computer equipment and storage medium
CN112749294B (en) Page hidden text recognition method, device, computer equipment and storage medium
CN117077624B (en) Word stock online processing method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210319