CN117172212A - Catalog extraction method and device in drawing, electronic equipment and storage medium - Google Patents

Catalog extraction method and device in drawing, electronic equipment and storage medium

Info

Publication number
CN117172212A
Authority
CN
China
Prior art keywords
catalog
information
target
text information
grouping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311031166.5A
Other languages
Chinese (zh)
Inventor
王宇涵
袁松岭
刘绍福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Wanyi Digital Technology Co ltd
Original Assignee
Shenzhen Wanyi Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Wanyi Digital Technology Co ltd
Priority to CN202311031166.5A
Publication of CN117172212A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method and device for extracting a catalog from a drawing, an electronic device, and a storage medium, applied in the technical field of computers. The method comprises: determining a target table belonging to a drawing catalog in a drawing to be identified; judging whether line segment information exists in the target table; if not, extracting first text information from the target table; classifying the first text information to obtain at least one classification result; based on the classification result, grouping the first text information vertically and horizontally to obtain a grouping result; determining a table structure of the target table based on the grouping result; and extracting the target table based on the table structure and the first text information. The method addresses the problems in the prior art of heavy recognition workload, excessive dependence on line segment and cell information, and incompatibility with non-standard tables and non-standard cases.

Description

Catalog extraction method and device in drawing, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for extracting a catalog in a drawing, an electronic device, and a storage medium.
Background
Tables are a common kind of document content. In practice it is often necessary to convert a table in a picture into an editable file format. Manual input is the simplest method, but it is inefficient when handling large numbers of tables and is prone to error.
In the related art, a drawing catalog is generally identified as follows: after table images are acquired with an image acquisition device, OCR recognition and straight line detection are performed on each table image, and line segment features are then used to calculate and extract the cells and the text inside them, so as to identify the table.
However, this recognition method involves a heavy recognition workload, depends excessively on line segment and cell information, and is not compatible with non-standard tables and non-standard cases.
Disclosure of Invention
The application provides a method and device for extracting a catalog from a drawing, an electronic device, and a storage medium, to address the problems in the prior art of heavy recognition workload, excessive dependence on line segment and cell information, and incompatibility with non-standard tables and non-standard cases.
In a first aspect, an embodiment of the present application provides a method for extracting a catalog in a drawing, including:
determining a target table belonging to a drawing catalog in a drawing to be identified;
judging whether line segment information exists in the target table;
if not, extracting first text information in the target table;
classifying the first text information to obtain at least one classification result;
based on the classification result, carrying out longitudinal grouping and transverse grouping on the first text information to obtain a grouping result;
determining a table structure of the target table based on the grouping result;
and extracting the target table based on the table structure and the first text information to obtain the target table.
Optionally, the determining the target table belonging to the drawing catalog in the drawing to be identified includes:
determining an initial form in the drawing to be identified;
extracting second text information and coordinate information in the initial table;
a target form in the initial form is determined based on the second text information and the coordinate information.
Optionally, the classifying the first text information to obtain at least one classification result includes:
inputting the first text information into a text classification model, and outputting the classification of each piece of the first text information through the text classification model to obtain the classification result.
Optionally, the performing, based on the classification result, vertical grouping and horizontal grouping on the first text information to obtain a grouping result includes:
determining first position information of each piece of first text information;
and carrying out longitudinal grouping and transverse grouping on the first text information according to the first position information.
Optionally, after the determining whether the line segment information exists in the target table, the method further includes:
if the line segment information exists, integrating the line segment information included in the target catalog to obtain target line segment information;
and determining a table structure of the drawing catalog based on the target line segment information.
Optionally, integrating the line segment information included in the target directory to obtain target line segment information, including:
judging whether any two line segments are relatively overlapped or not based on the line segment information;
if so, merging the first endpoints at one end of the two overlapping line segments so that the two line segments are merged, to obtain the target line segment information; and/or,
judging whether the distance between the second endpoints of any two line segments is within a preset range or not based on the line segment information;
if yes, merging the second endpoints to obtain the target line segment information.
Optionally, extracting the target table based on the table structure and the first text information, and after obtaining the target table, further includes:
determining sequence number information in the first text information;
if the sequence number information is discontinuous, determining the missing sequence number, and supplementing the missing sequence number in the target table.
In a second aspect, an embodiment of the present application provides a catalog extraction apparatus in a drawing, including:
the acquisition module is used for determining a target table belonging to a drawing catalog in the drawing to be identified;
the judging module is used for judging whether line segment information exists in the target table;
the first extraction module is used for extracting the first text information in the target table if the line segment information does not exist;
the classification module is used for classifying the first text information to obtain at least one classification result;
the grouping module is used for longitudinally grouping and transversely grouping the first text information based on the classification result to obtain a grouping result;
a determining module, configured to determine a table structure of the target table based on the grouping result;
and the second extraction module is used for extracting the target table based on the table structure and the first text information to obtain the target table.
In a third aspect, an embodiment of the present application provides an electronic device, including: the device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to execute the program stored in the memory, and implement the catalog extraction method in the drawing according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium storing a computer program, where the computer program when executed by a processor implements the method for extracting a catalog in a drawing according to the first aspect.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages. The method provided by the embodiment of the application determines the target table belonging to the drawing catalog in the drawing to be identified; judges whether line segment information exists in the target table; if not, extracts first text information from the target table; classifies the first text information to obtain at least one classification result; based on the classification result, groups the first text information vertically and horizontally to obtain a grouping result; determines a table structure of the target table based on the grouping result; and extracts the target table based on the table structure and the first text information. Because the target table belonging to the drawing catalog is determined first and only then extracted, the heavy computation of extracting every table before determining which one is the drawing catalog is avoided. In addition, processing the first text information in the target table makes it possible to determine the table structure of the target table, so that the target table can be extracted from the table structure and the first text information without relying on line segments, which makes the method compatible with non-standard tables and non-standard cases.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is an application scenario diagram of a catalog extraction method in a drawing provided by an embodiment of the present application;
FIG. 2 is a flowchart of a method for extracting a catalog in a drawing according to an embodiment of the present application;
FIG. 3 is a flowchart of a method for extracting a catalog in a drawing according to another embodiment of the present application;
FIG. 4 is a block diagram of a catalog extraction apparatus in a drawing provided by an embodiment of the present application;
FIG. 5 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
According to an embodiment of the application, a catalog extraction method in a drawing is provided. Optionally, in the embodiment of the present application, the catalog extraction method in the drawing may be applied to a hardware environment composed of the terminal 101 and the server 102 as shown in FIG. 1. As shown in FIG. 1, the server 102 is connected to the terminal 101 through a network and may provide services (such as application services) to the terminal or to clients installed on the terminal. A database may be provided on the server, or independently of the server, to provide data storage services to the server 102. The terminal 101 may be, but is not limited to, a PC, a mobile phone, a tablet computer, or the like.
The catalog extraction method in the drawing according to the embodiment of the present application may be executed by the server 102, may be executed by the terminal 101, or may be executed by both the server 102 and the terminal 101. The terminal 101 may execute the catalog extraction method in the drawing according to the embodiment of the present application, or may be executed by a client installed thereon.
Taking a terminal to execute the catalog extraction method in the drawing according to the embodiment of the present application as an example, fig. 2 is a schematic flow chart of an alternative catalog extraction method in the drawing according to the embodiment of the present application, as shown in fig. 2, the flow of the method may include the following steps:
step 201, determining a target table belonging to a drawing catalog in a drawing to be identified;
step 202, extracting first text information in the target table;
step 203, judging whether line segment information exists in the target table; if not, executing step 204; if so, executing step 208;
step 204, classifying the first text information to obtain at least one classification result;
step 205, based on the classification result, performing longitudinal grouping and transverse grouping on the first text information to obtain a grouping result;
step 206, determining a table structure of the target table based on the grouping result;
and step 207, extracting the target table based on the table structure and the first text information to obtain the target table.
In some embodiments, the target table belonging to the drawing catalog is determined first and only then extracted, which avoids the heavy computation of extracting every table before determining which one is the drawing catalog. In addition, processing the first text information in the target table makes it possible to determine the table structure of the target table, so that the target table can be extracted from the table structure and the first text information without relying on line segments, which makes the method compatible with non-standard tables and non-standard cases.
The primitive structure of a drawing has multiple levels and contains various special structures such as blocks and viewports. The non-frozen primitives on visible layers must be located, so the multi-level primitive structure of the original CAD file is traversed through its nesting, and the primitive information of all line segment types and text types is obtained.
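The nested traversal described above can be sketched as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the `Primitive` class, its `kind`, `layer`, and `children` fields, and the `frozen_layers` set are hypothetical stand-ins for whatever a CAD library exposes, and treating everything nested under a frozen primitive as hidden is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterator, Set


@dataclass
class Primitive:
    """Hypothetical stand-in for a CAD entity; not a real library type."""
    kind: str                       # e.g. "LINE", "TEXT", "BLOCK", "VIEWPORT"
    layer: str = "0"
    children: list = field(default_factory=list)  # nested Primitive objects


def iter_visible_primitives(root: Primitive, frozen_layers: Set[str]) -> Iterator[Primitive]:
    """Depth-first walk that yields LINE and TEXT primitives on non-frozen layers."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.layer in frozen_layers:
            continue  # assumption: a frozen primitive hides everything nested in it
        if node.kind in ("LINE", "TEXT"):
            yield node
        # Containers such as blocks and viewports are expanded, not yielded.
        stack.extend(reversed(node.children))
```

An explicit stack is used instead of recursion so that deeply nested blocks do not hit Python's recursion limit.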
In an alternative embodiment, in the case where there is line segment information, the method further includes:
step 208, integrating the line segment information included in the target directory to obtain target line segment information; and determining a table structure of the drawing catalog based on the target line segment information.
And after step 208, step 207 is performed.
In an optional embodiment, integrating the line segment information included in the target directory to obtain target line segment information includes:
judging whether any two line segments are relatively overlapped or not based on the line segment information;
if so, merging the first endpoints at one end of the two overlapping line segments so that the two line segments are merged, to obtain the target line segment information; and/or,
judging whether the distance between the second endpoints of any two line segments is within a preset range or not based on the line segment information;
if yes, merging the second endpoints to obtain the target line segment information.
In some embodiments, some CAD catalogs have a complete line segment table structure, but a catalog may also have only horizontal lines and no vertical line segments, or no line segments at all. Therefore, when it is not known whether the internal structure of the table is complete, an attempt to extract the internal structure of the table is made first. If a standard table structure is found, the result of that attempt is used; if not, the other steps are relied on.
Specifically, line segment error correction is performed first, because the original CAD line segments may have problems such as missing points, lines that appear to intersect but do not actually meet, and duplicated line segments. Lines running in the same direction that partially overlap are merged, with the allowed angle difference between them controlled by a threshold (for example, 0.5 degrees). Next, perpendicularly intersecting line segments are checked for error at the intersection point, and lines whose intersection error is within a certain threshold (for example, 0.001 meter) are extended or truncated.
Intersecting lines are then broken at their intersection points into multiple line segments, and finally endpoints that are close to each other are merged. After the corrected and completed line segment data is obtained, a minimum closed polygon algorithm is used to recognize closed rectangles, and each closed rectangle is a final cell.
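The merging of partially overlapping lines can be illustrated with a minimal sketch. It makes simplifying assumptions that go beyond the text: segments are already snapped to exactly horizontal (so the 0.5 degree angle check is not shown), each segment is a hypothetical `(x_start, x_end, y)` tuple, and rows are bucketed by rounding `y` to the tolerance. The default tolerance of 0.001 mirrors the example threshold mentioned above.

```python
from typing import Dict, List, Tuple

Segment = Tuple[float, float, float]  # (x_start, x_end, y) of a horizontal line


def merge_horizontal_segments(segments: List[Segment], tol: float = 0.001) -> List[Segment]:
    """Merge horizontal segments on the same row that overlap or nearly touch."""
    rows: Dict[int, List[Segment]] = {}
    for a, b, y in segments:
        key = round(y / tol)  # bucket rows whose y values differ by roughly less than tol
        rows.setdefault(key, []).append((min(a, b), max(a, b), y))

    merged: List[Segment] = []
    for key in sorted(rows):
        spans = sorted(rows[key])  # left-to-right within one row
        cur_x1, cur_x2, cur_y = spans[0]
        for x1, x2, y in spans[1:]:
            if x1 <= cur_x2 + tol:
                # Overlapping, or the gap is within tolerance: treat as one line.
                cur_x2 = max(cur_x2, x2)
            else:
                merged.append((cur_x1, cur_x2, cur_y))
                cur_x1, cur_x2, cur_y = x1, x2, y
        merged.append((cur_x1, cur_x2, cur_y))
    return merged
```

Vertical segments could be handled the same way with the roles of x and y swapped. Bucketing by rounding can split two nearly equal y values that fall on either side of a bucket boundary, so a production version would likely cluster rows more carefully.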
In an optional embodiment, the determining the target table belonging to the drawing catalog in the drawing to be identified includes:
determining an initial form in the drawing to be identified;
extracting second text information and coordinate information in the initial table;
a target form in the initial form is determined based on the second text information and the coordinate information.
In some embodiments, CAD drawings contain various table structures, but not every table is a drawing catalog. A table may instead be a title bar, a layer list, a building material list, a symbol table, a door and window list, and so on, and simple keywords cannot reliably determine whether a table is the drawing catalog, so the type of each candidate table must be determined.
The table type is determined using deep learning: all text and coordinate information within a frame formed by connected line segments is input, and the table type is output.
First, the text information is encoded into text features through a hidden layer, and the coordinate position of each text is converted into a relative position encoding based on the outer frame of the table; these two features form the input. The features are processed by a Transformer model and then by a linear classifier, which outputs a classification result indicating whether the table is a drawing catalog.
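The conversion of text coordinates into positions relative to the table's outer frame might look like the sketch below. The text does not specify the exact encoding, so normalizing each bounding box to fractions of the frame is an assumption, and the `Box` tuple layout and function name are hypothetical. The text encoder and the Transformer itself are not shown.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def relative_positions(text_boxes: List[Box], frame: Box) -> List[Box]:
    """Express each text box as fractions of the table's outer frame."""
    fx0, fy0, fx1, fy1 = frame
    width, height = fx1 - fx0, fy1 - fy0
    if width <= 0 or height <= 0:
        raise ValueError("frame must have positive width and height")
    return [
        ((x0 - fx0) / width, (y0 - fy0) / height,
         (x1 - fx0) / width, (y1 - fy0) / height)
        for x0, y0, x1, y1 in text_boxes
    ]
```

Normalizing this way makes the position feature independent of where the table sits in the drawing and of the drawing's scale, which is presumably the purpose of using the outer frame as the reference.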
Further, the determination of the target table may use keyword or rule matching in addition to deep learning.
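A keyword rule of the kind mentioned here could be sketched as follows. The keyword list (Chinese header terms for "drawing catalog", "serial number", "drawing number", and "drawing name") and the two-hit threshold are illustrative assumptions, not values taken from the application.

```python
from typing import Iterable

# Hypothetical header keywords: drawing catalog, serial number, drawing number, drawing name.
CATALOG_KEYWORDS = ("图纸目录", "序号", "图号", "图名")


def looks_like_catalog(texts: Iterable[str], min_hits: int = 2) -> bool:
    """Return True when enough distinct catalog keywords appear in the table's texts."""
    # Remove inner spaces so that spaced-out headers such as "序 号" still match;
    # join with newlines so keywords cannot form across two separate texts.
    joined = "\n".join(t.replace(" ", "") for t in texts)
    hits = sum(1 for kw in CATALOG_KEYWORDS if kw in joined)
    return hits >= min_hits
```

Requiring more than one hit is a small guard against title bars, which often contain a single drawing number field but are not catalogs.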
In an alternative embodiment, classifying the first text information to obtain at least one classification result includes:
inputting the first text information into a text classification model, and outputting the classification of each piece of the first text information through the text classification model to obtain the classification result.
In some embodiments, the text within the rectangular frame of the catalog is not necessarily all catalog line content; there may also be headers, title bars, personnel signatures, and other auxiliary text, so the content type of each text must be determined. For a catalog in a CAD drawing, the most important items in a catalog line are the serial number, the drawing number, and the drawing name. This stage uses a text classification model to classify all texts into four types: serial number, drawing number, drawing name, and other. The network uses BERT encoding followed by MLP (multi-layer perceptron) classification prediction to obtain the classification result of the first text information.
Besides deep learning, serial numbers, drawing names, and drawing numbers can also be recognized by text matching with regular expressions.
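A regular-expression fallback of this kind might be sketched as below. The three patterns are illustrative guesses at common formats (short integers for serial numbers, codes such as `JS-01` for drawing numbers, and names ending in typical drawing-type words); real drawings would need patterns tuned to the conventions actually used.

```python
import re

# Hypothetical patterns; none of them comes from the application itself.
SERIAL_RE = re.compile(r"^\d{1,3}$")
NUMBER_RE = re.compile(r"^[A-Za-z]{1,4}-?\d{1,4}(?:[-.]\d+)*$")
NAME_RE = re.compile(r"(图|表|说明|plan|section|elevation|detail)$", re.IGNORECASE)


def classify_text(text: str) -> str:
    """Classify one text as serial_number, drawing_number, drawing_name, or other."""
    t = text.strip()
    if SERIAL_RE.match(t):
        return "serial_number"
    if NUMBER_RE.match(t):
        return "drawing_number"
    if NAME_RE.search(t):
        return "drawing_name"
    return "other"
```

The checks run from the most specific pattern to the least specific, so a bare integer is never mistaken for a drawing number.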
In an alternative embodiment, based on the classification result, the first text information is vertically grouped and horizontally grouped to obtain a grouping result, which includes:
determining first position information of each piece of first text information;
and carrying out longitudinal grouping and transverse grouping on the first text information according to the first position information.
In an alternative embodiment, after extracting the target table based on the table structure and the first text information, the method further includes:
determining sequence number information in the first text information;
if the sequence number information is discontinuous, determining the missing sequence number, and supplementing the missing sequence number in the target table.
In some embodiments, referring to FIG. 3, the classified text data is grouped by column according to cross-column information. If a table structure was extracted beforehand, column grouping uses the natural spatial structure of the table data, with filtering by reference to the header information. If no table structure was extracted, the classified text data is grouped by self-organization (in the vertical direction), and the groups are filtered, checked, split, and merged using the header and type information.
Where there is no header, a voting mechanism is used for compatibility. Finally, the items within each group are sorted to produce valid serial number, drawing number, and drawing name group data. The valid serial numbers, drawing numbers, and drawing names are then grouped and paired (horizontally) according to whether the centroids of their text bounding rectangles lie on the same line; a secondary check is made on the relative position and distance of each match, and the final pairing produces the catalog line object data.
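The horizontal pairing by centroid line can be illustrated with a simplified sketch. It assumes a single catalog with one column per label, groups vertically by label only (the header filtering, voting, and secondary distance check described above are omitted), and uses a hypothetical `Item` record and `y_tol` tolerance.

```python
from typing import Dict, List, NamedTuple, Optional


class Item(NamedTuple):
    text: str
    label: str  # "serial_number", "drawing_number", or "drawing_name"
    cx: float   # centroid x of the text's bounding rectangle
    cy: float   # centroid y of the text's bounding rectangle


def pair_rows(items: List[Item], y_tol: float) -> List[Dict[str, str]]:
    """Pair each serial number with the drawing number and name on the same line."""
    columns: Dict[str, List[Item]] = {}
    for it in items:
        columns.setdefault(it.label, []).append(it)  # vertical grouping by type

    def nearest(label: str, cy: float) -> Optional[Item]:
        candidates = [c for c in columns.get(label, []) if abs(c.cy - cy) <= y_tol]
        return min(candidates, key=lambda c: abs(c.cy - cy), default=None)

    rows = []
    # Assumes CAD coordinates where y grows upward, so the top row has the largest y.
    for serial in sorted(columns.get("serial_number", []), key=lambda s: -s.cy):
        number = nearest("drawing_number", serial.cy)
        name = nearest("drawing_name", serial.cy)
        rows.append({
            "serial_number": serial.text,
            "drawing_number": number.text if number else "",
            "drawing_name": name.text if name else "",
        })
    return rows
```

Rows are anchored on serial numbers in this sketch, so a catalog line whose serial number was not detected would be dropped here.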
Abnormal serial numbers in the catalog line matching result are supplemented automatically. Because serial numbers are generally consecutive numbers or letters, a text search range is derived from the catalog lines with abnormal serial numbers, and the abnormal serial number is filled in using text whose centroid lies on the same line.
Catalog lines missed in the middle are recovered through an automatically derived middle retrieval range, and catalog lines missed at the tail are recovered through a tail retrieval range. The catalog line data is then sorted globally and the final result is output.
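Detecting which serial numbers are missing in the middle of a sequence can be sketched as follows. The sketch handles integer serial numbers only (the text also allows letters), and the function name is hypothetical.

```python
from typing import List


def missing_serials(serials: List[int]) -> List[int]:
    """Return the integers absent between the smallest and largest serial number found."""
    if not serials:
        return []
    present = set(serials)
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For example, `missing_serials([1, 2, 4, 5, 8])` returns `[3, 6, 7]`. Omissions after the last detected serial number cannot be found this way, which is consistent with the text treating tail omissions through a separate tail retrieval range.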
Compared with existing methods, the catalog extraction method in the drawing provided by the application determines the table type in advance using deep learning, which avoids recognizing non-catalog tables. Because the actual semantics of the text content are judged first, invalid interfering text is kept out of the catalog.
Judging the table type in advance reduces the generation of invalid data to a certain extent and prevents irrelevant table information from being misidentified as a catalog. Combining deep learning with rule-based pairing lowers the dependence on line data and makes the method compatible with more irregular cases. Deep learning prevents interfering text from being recognized as valid information and judges the start and end of the catalog well, while rule-based pairing, through post-processing, resolves some problems of omission and compatibility with special cases. Combining the two gives a better result.
Based on the same conception, the embodiment of the present application provides a catalog extraction device in a drawing, and the specific implementation of the device may be referred to the description of the embodiment of the method, and the repetition is omitted, as shown in fig. 4, where the device mainly includes:
an obtaining module 401, configured to determine a target table belonging to a drawing catalog in a drawing to be identified;
a judging module 402, configured to judge whether line segment information exists in the target table;
a first extracting module 403, configured to extract first text information in the target table if the line segment information does not exist;
the classification module 404 is configured to classify the first text information to obtain at least one classification result;
a grouping module 405, configured to perform a vertical grouping and a horizontal grouping on the first text information based on the classification result, so as to obtain a grouping result;
a determining module 406, configured to determine a table structure of the target table based on the grouping result;
and a second extraction module 407, configured to extract the target table based on the table structure and the first text information, so as to obtain the target table.
Based on the same conception, the embodiment of the application also provides an electronic device, as shown in fig. 5, which mainly comprises: processor 501, memory 502 and communication bus 503, wherein processor 501 and memory 502 accomplish the communication between each other through communication bus 503. The memory 502 stores a program executable by the processor 501, and the processor 501 executes the program stored in the memory 502 to implement the following steps:
determining a target table belonging to a drawing catalog in a drawing to be identified;
judging whether line segment information exists in the target table;
if not, extracting first text information in the target table;
classifying the first text information to obtain at least one classification result;
based on the classification result, carrying out longitudinal grouping and transverse grouping on the first text information to obtain a grouping result;
determining a table structure of the target table based on the grouping result;
and extracting the target table based on the table structure and the first text information to obtain the target table.
The communication bus 503 in the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 503 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in FIG. 5, but this does not mean that there is only one bus or one type of bus.
The memory 502 may include random access memory (RAM) or non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor 501.
The processor 501 may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP), or it may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present application, there is also provided a computer-readable storage medium having stored therein a computer program which, when run on a computer, causes the computer to execute the catalog extraction method in the drawing described in the above embodiment.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, microwave, etc.) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape, etc.), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk), etc.
It should be noted that in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing describes only specific embodiments of the application, provided to enable those skilled in the art to understand or practice the application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for extracting a catalog from a drawing, characterized by comprising the following steps:
determining a target table belonging to a drawing catalog in a drawing to be identified;
extracting first text information in the target table;
judging whether line segment information exists in the target table;
if not, classifying the first text information to obtain at least one classification result;
based on the classification result, carrying out longitudinal grouping and transverse grouping on the first text information to obtain a grouping result;
determining a table structure of the target table based on the grouping result;
and extracting the target table based on the table structure and the first text information to obtain the target table.
2. The method for extracting a catalog from a drawing according to claim 1, wherein the determining a target table belonging to a drawing catalog in a drawing to be identified comprises:
determining an initial table in the drawing to be identified;
extracting second text information and coordinate information in the initial table;
determining the target table in the initial table based on the second text information and the coordinate information.
3. The method for extracting a catalog from a drawing according to claim 1, wherein the classifying the first text information to obtain at least one classification result comprises:
inputting the first text information into a text classification model, and outputting the classification of each piece of the first text information through the text classification model to obtain the classification result.
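The claim does not say which text classification model is used or which categories it outputs. As a rough illustration only, the sketch below replaces the model with a few regular-expression rules. The category names (sequence_number, drawing_number, drawing_name) and the patterns are assumptions made for this example and are not taken from the application.

```python
import re

# Hypothetical category labels; the claim does not enumerate them.
SEQ_NO = "sequence_number"
DRAWING_NO = "drawing_number"
DRAWING_NAME = "drawing_name"


def classify_text(text: str) -> str:
    """Rule-based stand-in for the text classification model (an assumption)."""
    s = text.strip()
    # Assumed: a short run of digits is a row sequence number.
    if re.fullmatch(r"\d{1,3}", s):
        return SEQ_NO
    # Assumed pattern for drawing numbers such as "JS-01" or "A101".
    if re.fullmatch(r"[A-Za-z]{1,4}-?\d{1,4}[A-Za-z]?", s):
        return DRAWING_NO
    # Everything else is treated as a drawing name.
    return DRAWING_NAME


def classify_all(texts):
    """Return (text, category) pairs, one per piece of first text information."""
    return [(t, classify_text(t)) for t in texts]
```

A trained classifier would replace classify_text while keeping the same input and output shape, so the later grouping step would not need to change.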
4. The method for extracting a catalog from a drawing according to claim 1, wherein the step of vertically grouping and horizontally grouping the first text information based on the classification result to obtain a grouping result includes:
determining first position information of each piece of first text information;
and carrying out longitudinal grouping and transverse grouping on the first text information according to the first position information.
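One possible reading of the position-based grouping in claim 4 is sketched below. It assumes each piece of text carries an (x, y) anchor coordinate, that longitudinal grouping means collecting texts with similar x into columns, and that transverse grouping means collecting texts with similar y into rows. The TextItem type, the tolerance values, and the chained clustering rule are all assumptions for illustration; the classification result is not used in this simplified version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextItem:
    text: str
    x: float  # assumed: left edge of the text's bounding box
    y: float  # assumed: vertical anchor of the text's bounding box


def cluster_1d(items, key, tol):
    """Sort items by key and start a new group whenever the gap to the
    previous item in sorted order exceeds tol."""
    groups = []
    for item in sorted(items, key=key):
        if groups and key(item) - key(groups[-1][-1]) <= tol:
            groups[-1].append(item)
        else:
            groups.append([item])
    return groups


def group_texts(items, x_tol=5.0, y_tol=2.0):
    """Return (columns, rows): texts grouped by similar x and by similar y."""
    columns = cluster_1d(items, key=lambda t: t.x, tol=x_tol)
    rows = cluster_1d(items, key=lambda t: t.y, tol=y_tol)
    return columns, rows
```

Under these assumptions, the number of column groups and row groups gives a candidate table structure when the table has no ruling lines.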
5. The method for extracting a catalog from a drawing according to claim 1, wherein, after the judging whether line segment information exists in the target table, the method further comprises:
if the line segment information exists, integrating the line segment information included in the target catalog to obtain target line segment information;
and determining a table structure of the drawing catalog based on the target line segment information.
6. The method for extracting a catalog from a drawing of claim 5, wherein integrating the line segment information included in the target catalog to obtain target line segment information comprises:
judging whether any two line segments are relatively overlapped or not based on the line segment information;
if so, merging the first endpoints at one end of the two overlapping line segments so that the two line segments are merged, to obtain the target line segment information; and/or,
judging whether the distance between the second endpoints of any two line segments is within a preset range or not based on the line segment information;
if yes, merging the second endpoints to obtain the target line segment information.
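The two integration rules in claim 6 can be illustrated with a simplified sketch. It assumes horizontal segments given as (y, x_start, x_end) tuples, treats segments as lying on the same line when their y values agree after rounding to one decimal place, and snaps endpoints that fall within a preset distance eps of an earlier endpoint. The representation, the rounding rule, and the default eps are assumptions for this example; the claim itself fixes none of them.

```python
from math import hypot


def merge_overlapping_horizontal(segments, ndigits=1):
    """Merge horizontal segments (y, x_start, x_end) that share the same
    rounded y and whose x ranges overlap or touch."""
    normalized = sorted(
        (round(y, ndigits), min(a, b), max(a, b)) for y, a, b in segments
    )
    merged = []
    for y, x1, x2 in normalized:
        if merged and merged[-1][0] == y and x1 <= merged[-1][2]:
            last_y, last_x1, last_x2 = merged[-1]
            # Extend the previous segment to cover the current one.
            merged[-1] = (last_y, last_x1, max(last_x2, x2))
        else:
            merged.append((y, x1, x2))
    return merged


def snap_endpoints(points, eps=1.0):
    """Map each endpoint to a representative point; an endpoint within eps
    of an earlier representative is merged into it."""
    representatives = []
    mapping = {}
    for p in points:
        for r in representatives:
            if hypot(p[0] - r[0], p[1] - r[1]) <= eps:
                mapping[p] = r
                break
        else:
            representatives.append(p)
            mapping[p] = p
    return mapping
```

Vertical segments could be handled the same way by swapping the roles of x and y before calling merge_overlapping_horizontal.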
7. The method for extracting a catalog from a drawing according to claim 1, wherein, after the extracting the target table based on the table structure and the first text information to obtain the target table, the method further comprises:
determining sequence number information in the first text information;
if the sequence number information is discontinuous, determining the missing sequence number, and supplementing the missing sequence number in the target table.
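The sequence number check in claim 7 amounts to finding gaps in a run of integers. The sketch below assumes the sequence numbers have already been parsed to integers and that each extracted row is a dictionary with hypothetical keys "seq" and "name"; both the row layout and the placeholder value for a supplemented row are assumptions for this example.

```python
def find_missing_sequence_numbers(numbers):
    """Return the integers missing between the smallest and largest number."""
    present = set(numbers)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def fill_missing_rows(rows):
    """Insert a placeholder row for every missing sequence number.

    rows: list of dicts with an integer "seq" key (hypothetical layout).
    """
    by_seq = {row["seq"]: row for row in rows}
    if not by_seq:
        return []
    lo, hi = min(by_seq), max(by_seq)
    # Assumed placeholder: a row with the missing number and no name.
    return [by_seq.get(n, {"seq": n, "name": None}) for n in range(lo, hi + 1)]
```

This version only detects gaps inside the observed range; it cannot tell whether rows are missing before the first or after the last sequence number.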
8. A catalog extraction apparatus in a drawing sheet, comprising:
the acquisition module is used for determining a target table belonging to a drawing catalog in the drawing to be identified;
the judging module is used for judging whether line segment information exists in the target table;
the first extraction module is used for extracting the first text information in the target table if the line segment information does not exist;
the classification module is used for classifying the first text information to obtain at least one classification result;
the grouping module is used for longitudinally grouping and transversely grouping the first text information based on the classification result to obtain a grouping result;
a determining module, configured to determine a table structure of the target table based on the grouping result;
and the second extraction module is used for extracting the target table based on the table structure and the first text information to obtain the target table.
9. An electronic device, comprising: the device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to execute the program stored in the memory, and implement the catalog extraction method in the drawing according to any one of claims 1 to 7.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the method for extracting a catalog in a drawing according to any one of claims 1 to 7.
CN202311031166.5A 2023-08-15 2023-08-15 Catalog extraction method and device in drawing, electronic equipment and storage medium Pending CN117172212A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311031166.5A CN117172212A (en) 2023-08-15 2023-08-15 Catalog extraction method and device in drawing, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117172212A (en) 2023-12-05

Family

ID=88946026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311031166.5A Pending CN117172212A (en) 2023-08-15 2023-08-15 Catalog extraction method and device in drawing, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117172212A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117390812A (en) * 2023-12-11 2024-01-12 江西少科智能建造科技有限公司 CAD drawing warm ventilation pipe structured information extraction method and system
CN117390812B (en) * 2023-12-11 2024-03-08 江西少科智能建造科技有限公司 CAD drawing warm ventilation pipe structured information extraction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination