CN113505669A - Form extraction method and device in engineering drawing, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113505669A
Authority
CN
China
Prior art keywords
determining
elements
position information
row
engineering drawing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110729449.1A
Other languages
Chinese (zh)
Inventor
钟克强
祝汉武
夏晨曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wanyi Technology Co Ltd
Original Assignee
Wanyi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wanyi Technology Co Ltd
Priority to CN202110729449.1A
Publication of CN113505669A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application relates to a method and a device for extracting a form from an engineering drawing, an electronic device, and a storage medium, applied to the technical field of data processing. The method comprises: determining a region to be identified in an engineering drawing, wherein the region to be identified comprises first elements of a form to be extracted; acquiring position information of each first element in the region to be identified; determining the position relation among the first elements according to the position information; and storing the first elements in table form according to the position relation to obtain the form to be extracted from the engineering drawing. The method addresses a problem in the related art, in which the structured data of a table is classified, spliced and sorted by line splicing: because a building drawing contains a large number of lines and line splicing has high time complexity, the algorithm is slow and inefficient.

Description

Form extraction method and device in engineering drawing, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a method and an apparatus for extracting a form from an engineering drawing, an electronic device, and a storage medium.
Background
In recent years, computer technology has been increasingly applied to assist design work in traditional industries. Design drawings are one of the final deliverables of design work, and reviewers are required to review them to check that the design conforms to the relevant standards.
Reviewing the tables in a drawing is one of the review items, and the tables must first be extracted before they can be reviewed further.
Table extraction methods in the related art mainly use line splicing to classify, splice and arrange the structured data of a table. However, a building drawing contains a large number of lines, and splicing them has high time complexity, so the algorithm is slow and inefficient.
Disclosure of Invention
The related art classifies, splices and arranges the structured data of a form by line splicing; because a building drawing contains a large number of lines and splicing them has high time complexity, the algorithm is slow and inefficient. In order to solve this technical problem, or at least partially solve it, the application provides a method and a device for extracting a form in an engineering drawing, an electronic device and a storage medium.
In a first aspect, the present application provides a method for extracting a form from an engineering drawing, including:
determining a region to be identified in an engineering drawing, wherein the region to be identified comprises a first element in a form to be extracted;
acquiring position information of each first element in the area to be identified;
determining the position relation among the first elements according to the position information;
and storing the first element in a table form according to the position relation to obtain the table to be extracted in the engineering drawing.
Optionally, the determining the area to be identified in the engineering drawing includes:
identifying each original element in the engineering drawing;
matching the original elements with a first preset character element set, wherein the first preset character element set comprises character elements which represent that the table to be extracted is a preset type table;
and taking the area where the successfully matched original element is located as the area to be identified.
Optionally, the determining the position relationship between the first elements according to the position information includes:
determining a second element belonging to a row or column header in the first element;
determining a third element in the same column or row as the second element according to the position information;
aligning the third elements in a row or column direction according to the position information;
and taking the row-column relationship between the second element and the third element as the position relationship.
Optionally, the determining a second element belonging to a row or column heading in the first element includes:
matching the first element with a second preset text element set, wherein the second preset text element set comprises text elements which represent row titles or column titles of a preset type table;
acquiring the ordinate or abscissa of each successfully matched element;
and taking each first element whose position information lies within a first preset range of that ordinate or abscissa as the second element.
Optionally, the position information includes an abscissa and an ordinate; determining a third element in the same column or row as the second element according to the position information includes:
acquiring the abscissa or the ordinate of the second element;
and taking the original element in a second preset range of the ordinate or the abscissa of the second element as the third element.
Optionally, the aligning the third element in the row or column direction according to the position information includes:
determining a target element in the second elements;
acquiring sub-target elements of a column or a row where the target elements are located;
acquiring the ordinate or abscissa of each sub-target element;
and treating each third element whose ordinate or abscissa lies within a third preset range of a sub-target element as being in the same row or column as that sub-target element.
Optionally, the determining a position relationship between the first elements according to the position information further includes:
determining a fourth element in the first element that matches a third preset text element set, wherein the third preset text element set comprises text elements which indicate that the table to be extracted is a preset type table and which lie outside the preset type table;
determining a fifth element in the preset range of the fourth element according to the position information, wherein the fifth element is a line element;
and determining the position relation between the fifth element and the third element according to the position information of the fifth element.
In a second aspect, the present application provides a form extracting apparatus in engineering drawings, including:
a first determining module, configured to determine a to-be-identified area in an engineering drawing, wherein the to-be-identified area comprises a first element in a to-be-extracted form;
the acquisition module is used for acquiring the position information of each first element in the area to be identified;
the second determining module is used for determining the position relation among the first elements according to the position information;
and the table extraction module is used for storing the first element in a table form according to the position relation to obtain table data in the engineering drawing.
In a third aspect, an electronic device is provided, which includes a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of the form extraction method in the engineering drawing in any embodiment of the first aspect when executing the program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the steps of the form extraction method in the engineering drawing according to any one of the embodiments of the first aspect.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
according to the method provided by the embodiment of the application, the area to be identified in the engineering drawing is determined, the area to be identified comprises first elements in the form to be extracted, and then position information of each first element in the area to be identified is obtained; determining the position relation among the first elements according to the position information; and storing the first element in a table form according to the position relation to obtain a table to be extracted in the engineering drawing. Therefore, when the form in the engineering drawing is extracted, splicing is carried out without depending on line information in the engineering drawing, the form to be extracted can be extracted through the position relation of elements in the form, the extraction process is more convenient, the extraction of the form cannot be influenced due to the deletion or the truncation of lines, and the efficiency of the form extraction process is higher.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic flow chart of a method for extracting a form in an engineering drawing according to an embodiment of the present application;
fig. 2 is a schematic diagram of a table to be extracted according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a form extraction apparatus in an engineering drawing according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a diagram illustrating a method for extracting a form in an engineering drawing according to an embodiment of the present disclosure, where the method may be applied to any form of electronic devices, such as a terminal and a server. As shown in fig. 1, the method may specifically include the following steps:
Step 101, determining a to-be-identified area in the engineering drawing, wherein the to-be-identified area comprises a first element in a to-be-extracted form.
In some embodiments, an engineering drawing contains a large amount of element information, such as lines, characters and numbers, so when a form is extracted from the engineering drawing, the area where the form is located is determined first. In practice, different tables have corresponding keywords; for example, when the table in the engineering drawing is a floor height table, the keywords include key information such as floor and elevation. The to-be-identified area containing the table to be extracted can therefore be determined from the text information in the engineering drawing.
The engineering drawing may be, but is not limited to, an architectural engineering drawing (e.g., a drawing of at least one of an airport, a train station, a bus station, an office building, a residential building, a hospital, a museum, a tourist attraction, a church, a school, a park, etc.).
In a specific embodiment, determining the area to be identified in the engineering drawing includes:
identifying each original element in the engineering drawing; matching the original elements with a first preset character element set, wherein the first preset character element set comprises character elements which represent that a table to be extracted is a preset type table; and taking the area where the successfully matched original element is located as the area to be identified.
In some embodiments, the engineering drawing includes characters, numbers, lines, and the like, each original element in the engineering drawing is identified to determine a character element in the original element, and the original element is matched with the first preset character element set to determine an element in the first preset character element set from the original element.
The first preset text element set comprises text elements which represent that the form to be extracted is a preset type form, such as phrases and symbols applicable to engineering drawings. For example, referring to fig. 2, when the table to be extracted is a floor height table, the first preset text element set may include "1 floor", "one floor", "roof floor", and the like representing a floor, "39.65 to 78.92" representing a height, "C30" representing a concrete grade, "primary", "secondary", and the like representing an earthquake-resistant grade.
It is to be understood that, when the preset type tables are different, the first preset text element set may correspond to the corresponding preset type table, or the first preset text element set includes text elements of all types of tables.
Thus, after the text elements in the first preset text element set are identified, the region can be determined to be the region where the table to be extracted is located.
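The region determination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the `TextElement` type, the keyword set, the `margin` parameter and the choice of a bounding box as the "area" are all hypothetical, and exact string matching stands in for whatever matching rule is actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    """A text element with an axis-aligned bounding box (hypothetical type)."""
    text: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


# Hypothetical first preset text element set for a floor height table.
FLOOR_TABLE_KEYWORDS = {"1 floor", "roof floor", "C30", "primary", "secondary"}


def find_region(elements, keywords, margin=0.0):
    """Return the box (x_min, y_min, x_max, y_max) enclosing all matched
    elements, widened by `margin`, or None when nothing matches."""
    matched = [e for e in elements if e.text in keywords]
    if not matched:
        return None
    return (
        min(e.x_min for e in matched) - margin,
        min(e.y_min for e in matched) - margin,
        max(e.x_max for e in matched) + margin,
        max(e.y_max for e in matched) + margin,
    )


elements = [
    TextElement("1 floor", 10, 10, 30, 14),
    TextElement("C30", 60, 10, 70, 14),
    TextElement("roof floor", 10, 50, 35, 54),
    TextElement("North elevation", 200, 300, 260, 306),  # not a table keyword
]
region = find_region(elements, FLOOR_TABLE_KEYWORDS, margin=2.0)
print(region)
```

The unmatched "North elevation" label does not influence the region, which is the point of keying the search on table-specific text.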
Step 102, acquiring the position information of each first element in the area to be identified.
In some embodiments, the position information of an element is usually generated when the engineering drawing is drawn; for example, after a line is drawn, its coordinate values are obtained from its position in the coordinate system of the engineering drawing.
After the area to be identified is determined, the position information of each first element can be obtained from the element information of each element stored in the engineering drawing.
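As a rough illustration of reading position information from stored element information, the sketch below assumes each element is kept as a record with either a bounding box (text) or two endpoints (line). The record layout and the function names are hypothetical; a real drawing format would supply its own structures.

```python
def bbox_of(record):
    """Bounding box (x_min, y_min, x_max, y_max) of a stored element record."""
    if record["kind"] == "line":
        (x1, y1), (x2, y2) = record["start"], record["end"]
        return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))
    return tuple(record["bbox"])


def inside(bbox, region):
    """True when bbox lies entirely within region."""
    return (region[0] <= bbox[0] and region[1] <= bbox[1]
            and bbox[2] <= region[2] and bbox[3] <= region[3])


def positions_in_region(records, region):
    """Pair every record that lies in the region with its position information."""
    return [(r, bbox_of(r)) for r in records if inside(bbox_of(r), region)]


records = [
    {"kind": "text", "text": "C30", "bbox": (60, 10, 70, 14)},
    {"kind": "line", "start": (72, 8), "end": (8, 8)},
    {"kind": "text", "text": "North elevation", "bbox": (200, 300, 260, 306)},
]
found = positions_in_region(records, region=(8, 8, 72, 56))
print([box for _, box in found])
```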
Step 103, determining the position relation among the first elements according to the position information.
In some embodiments, each table has a row header or a column header that indicates what the row or column represents. The elements of the row header or column header usually lie at the edge of the table. On this basis, the row header or column header of the table can be determined, and then, from the position information of each header element, the first elements in the row or column where that header element is located, and the position relation among those first elements, can be determined.
In an optional embodiment, determining the position relationship between the first elements according to the position information may specifically include:
determining a second element belonging to a row or column header in the first element; determining a third element in the same column or row with the second element according to the position information; aligning the third elements in a row or column direction according to the position information; and taking the row-column relationship between the second element and the third element as the position relationship.
In some embodiments, when determining the position relationship between the first elements, a second element belonging to a row header or a column header in the table to be extracted may be determined first.
In a specific embodiment, determining the second element belonging to the row or column heading in the first element comprises:
matching the first element with a second preset text element set, wherein the second preset text element set comprises text elements which represent row titles or column titles of a preset type table; acquiring the ordinate or abscissa of each successfully matched element; and taking each first element whose position information lies within a first preset range of that ordinate or abscissa as a second element.
In some embodiments, the first element may be matched against a second preset text element set when determining the second element. For example, taking the table to be extracted as a floor height table: the header of a floor height table generally includes a layer number, a structure elevation, a layer height, and the like. By placing the text elements corresponding to the table header in the second preset text element set and matching the first element against that set, the second element belonging to the row header or the column header can be determined.
It is understood that if the successfully matched second elements do not cover all row headers or column headers, the remaining headers may be determined from the second elements that did match.
For example, when the first element is matched with the second preset text element set, if only the layer number and the layer height are successfully matched, whether they are row titles or column titles can be determined from their coordinate information. Taking row titles as an example, the ordinates of the layer number and the layer height are equal, so the first elements can be searched further for elements whose ordinate equals that of the layer number and the layer height, and those elements are also taken as second elements, thereby obtaining all the second elements belonging to the row or column title.
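The header recovery just described can be sketched as below, assuming the headers lie on one horizontal row. The `Cell` type, the keyword set, the tolerance `tol` (standing in for the first preset range) and the use of the vertical centre as "the ordinate" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A text element with its coordinate range (hypothetical type)."""
    text: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def y_center(self) -> float:
        return (self.y_min + self.y_max) / 2


# Hypothetical second preset text element set; it deliberately omits
# "structure elevation" to show how an unmatched header is recovered.
HEADER_KEYWORDS = {"layer number", "layer height"}


def find_header_row(cells, keywords, tol=1.0):
    """Return all cells on the same row as the keyword-matched headers."""
    matched = [c for c in cells if c.text in keywords]
    if not matched:
        return []
    # Reference ordinate: mean vertical centre of the matched headers.
    ref_y = sum(c.y_center for c in matched) / len(matched)
    headers = [c for c in cells if abs(c.y_center - ref_y) <= tol]
    return sorted(headers, key=lambda c: c.x_min)


cells = [
    Cell("layer number", 0, 100, 20, 104),
    Cell("structure elevation", 30, 100.2, 60, 104.2),
    Cell("layer height", 70, 100, 90, 104),
    Cell("1", 0, 90, 20, 94),  # body cell, one row below the headers
]
header_texts = [c.text for c in find_header_row(cells, HEADER_KEYWORDS)]
print(header_texts)
```

Using a tolerance rather than strict equality of ordinates allows for small drafting offsets between header texts.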
In a particular embodiment, the location information includes an abscissa and an ordinate; determining a third element in the same column or row as the second element according to the position information, comprising:
acquiring the abscissa or the ordinate of the second element; and taking the original element in a second preset range of the ordinate or the abscissa of the second element as a third element.
In some embodiments, after determining the second element belonging to a row or column header, the element of the column or row corresponding thereto may be determined based on the second element.
It is understood that, since the elements in the table are usually characters or numbers, the coordinate values thereof are usually a coordinate range, and when determining the third element, the original element in the second preset range of the ordinate or abscissa of the second element can be used as the third element.
Taking second elements that are row titles as an example: the row titles lie on the same row as one another, so their ordinates are the same; the elements corresponding to a row title lie in the same column as that title, so their abscissas are the same. The third elements in the same column as each second element can therefore be determined from the abscissa of that second element. The second preset range may be set according to actual conditions; for example, if the abscissa of the second element spans x1 to x2, the second preset range may be from x1 ± a to x2 ± b, where a and b are preset values that may be set according to actual conditions.
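A minimal sketch of collecting the elements under one header by abscissa follows. It reads the second preset range as the widened interval from x1 - a to x2 + b, which is only one possible interpretation of the text; the `Cell` type, the sample values and the assumption that the ordinate increases upward are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A text element with its coordinate range (hypothetical type)."""
    text: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def column_members(header, cells, a=1.0, b=1.0):
    """Return the cells whose abscissa range lies within
    [header.x_min - a, header.x_max + b], ordered from top to bottom."""
    lo, hi = header.x_min - a, header.x_max + b
    members = [
        c for c in cells
        if c != header and lo <= c.x_min and c.x_max <= hi
    ]
    # Assumes the ordinate increases upward, so larger y means higher up.
    return sorted(members, key=lambda c: c.y_min, reverse=True)


header = Cell("layer height", 70, 100, 90, 104)
cells = [
    header,
    Cell("3.00", 71, 90, 89, 94),
    Cell("2.90", 70.5, 80, 90.5, 84),
    Cell("1", 0, 90, 20, 94),  # belongs to a different column
]
column_texts = [c.text for c in column_members(header, cells)]
print(column_texts)
```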
In one embodiment, aligning the third elements in a row or column direction based on the position information comprises:
determining a target element in the second element; acquiring sub-target elements of a column or a row where the target elements are located; acquiring the ordinate or abscissa of each sub-target element; and taking a third element with the ordinate or the abscissa in a third preset range of the sub-target elements as the same row or column.
In some embodiments, after determining the row header and the column element corresponding to the row header, it is further required to align the column elements of different columns to obtain a complete table.
Specifically, one of the second elements may be used as a target element. For example, when the table to be extracted is a floor height table, the element whose row title is the layer number may be used as the target element; correspondingly, the elements listed under the layer number, such as underground first layer, underground second layer, underground third layer, first layer and second layer, are used as sub-target elements. When the elements in the row of a sub-target element are aligned, the elements in the same row as the sub-target element are determined from the coordinate values of the sub-target element. Illustratively, if the ordinate of the sub-target element "next floor" spans y1 to y2, the third preset range may be from y1 ± c to y2 ± d, where c and d are preset values that may be set according to actual situations. Thus, the third elements can be aligned in the row or column direction by using the coordinates of each sub-target element to determine the elements in the same row or column as it.
In this embodiment, the elements in the same row or column as a given element are determined from the coordinates of that element. This abandons the related-art approach of relying on the lines of the table to determine element positions, so the elements can still be extracted in the layout of the original table when the lines of the table to be extracted are incomplete.
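The row alignment can be sketched as follows, with the sub-target elements acting as row anchors. The `Cell` type, the sample values and the reading of the third preset range as the widened interval from y1 - c to y2 + d are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A text element with its coordinate range (hypothetical type)."""
    text: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def align_rows(sub_targets, others, c=0.5, d=0.5):
    """Attach each element in `others` to the first sub-target whose widened
    ordinate range [y_min - c, y_max + d] contains it; return rows of text
    ordered left to right."""
    rows = [[s] for s in sub_targets]
    for e in others:
        for row, s in zip(rows, sub_targets):
            if s.y_min - c <= e.y_min and e.y_max <= s.y_max + d:
                row.append(e)
                break
    return [[el.text for el in sorted(r, key=lambda el: el.x_min)] for r in rows]


# Sub-target elements: the cells under the "layer number" header.
sub_targets = [Cell("2", 0, 90, 20, 94), Cell("1", 0, 80, 20, 84)]
others = [
    Cell("3.00", 70, 90.2, 90, 94.2),   # slightly offset, still row "2"
    Cell("4.200", 30, 79.8, 60, 83.8),  # slightly offset, still row "1"
    Cell("2.90", 70, 80, 90, 84),
    Cell("7.200", 30, 90, 60, 94),
]
aligned = align_rows(sub_targets, others)
print(aligned)
```

No line elements are consulted here, which mirrors the point made above that alignment does not depend on the ruling lines of the table.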
It should be noted that, while aligning the third elements in the row or column direction, if the ordinate or abscissa range of an element contains the ordinate or abscissa ranges of several sub-target elements, the element can be determined to occupy a merged cell in the table to be extracted, and it is extracted as a merged cell.
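The merged-cell test reduces to an interval containment check. In the sketch below, ranges are plain `(low, high)` tuples and the function names are hypothetical.

```python
def spanned_rows(element_range, row_ranges):
    """Indices of the sub-target rows whose ordinate range lies entirely
    inside the ordinate range of the element."""
    lo, hi = element_range
    return [i for i, (r_lo, r_hi) in enumerate(row_ranges)
            if lo <= r_lo and r_hi <= hi]


def is_merged(element_range, row_ranges):
    """An element covering more than one sub-target row is a merged cell."""
    return len(spanned_rows(element_range, row_ranges)) > 1


# Ordinate ranges of three sub-target elements, listed top to bottom.
row_ranges = [(30.0, 34.0), (20.0, 24.0), (10.0, 14.0)]

print(spanned_rows((19.0, 35.0), row_ranges), is_merged((19.0, 35.0), row_ranges))
print(spanned_rows((10.0, 14.0), row_ranges), is_merged((10.0, 14.0), row_ranges))
```

The same check applies in the column direction by passing abscissa ranges instead of ordinate ranges.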
In a specific embodiment, determining a position relationship between the first elements according to the position information further includes:
determining a fourth element which is matched with a third preset character element set in the first element, wherein the third preset character element set comprises character elements which represent that the table to be extracted is a preset type table and are outside the preset type table; determining a fifth element in a preset range of the fourth element according to the position information, wherein the fifth element is a line element; and determining the position relation between the fifth element and the third element according to the position information of the fifth element.
In some embodiments, the table to be extracted is a regular table, that is, every row in the table has the same width and every column has the same width; in practice, however, there are cases where table data still exists outside the regular table. The data outside the regular table differs between tables to be extracted; for example, if the table to be extracted is a floor height table, the data outside the table includes elements such as "bottom reinforced area" and "constraint edge member".
In this embodiment, the first element is matched with the third preset text element set; if the matching succeeds, it is determined that the table to be extracted has data outside the regular table (i.e., the fourth element). The line elements near the fourth element are determined from the position information of the fourth element, and the range indicated by the fourth element is determined from them.
Illustratively, the table to be extracted is a floor height table, the fourth element is the "bottom reinforced area" label, and 3 line segments (the fifth elements) exist near it. By comparing the position information of the fifth elements with that of the third elements, the position relation between the fifth elements and the third elements is determined, which gives the range of the floor height table covered by the bottom reinforced area, that is, the scope of the fourth element.
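One way to picture this step is sketched below. It is an assumption-heavy illustration: the text does not specify how "near" is measured or how the line segments delimit the range, so the sketch treats a line as near when its bounding box intersects the label box grown by a hypothetical `reach`, and takes the vertical extent of the nearby lines as the range of the label.

```python
def line_bbox(line):
    """Bounding box of a line segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = line
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def intersects(a, b):
    """True when two boxes (x_min, y_min, x_max, y_max) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def lines_near(label_bbox, lines, reach):
    """Line elements whose box intersects the label box grown by `reach`."""
    grown = (label_bbox[0] - reach, label_bbox[1] - reach,
             label_bbox[2] + reach, label_bbox[3] + reach)
    return [ln for ln in lines if intersects(line_bbox(ln), grown)]


def covered_rows(near_lines, row_centers):
    """Rows whose vertical centre lies within the extent of the nearby lines."""
    if not near_lines:
        return []
    y_lo = min(line_bbox(ln)[1] for ln in near_lines)
    y_hi = max(line_bbox(ln)[3] for ln in near_lines)
    return [name for name, y in row_centers.items() if y_lo <= y <= y_hi]


label = (100, 12, 130, 16)  # box of the "bottom reinforced area" text
lines = [
    ((95, 5), (95, 25)),    # vertical stroke of a bracket
    ((90, 5), (95, 5)),     # lower stroke
    ((90, 25), (95, 25)),   # upper stroke
    ((300, 0), (300, 50)),  # unrelated line far away
]
near = lines_near(label, lines, reach=10)
rows = covered_rows(near, {"1": 8, "2": 18, "3": 28})
print(len(near), rows)
```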
Step 104, storing the first element in a table form according to the position relation to obtain a table to be extracted in the engineering drawing.
In some embodiments, after the position relation between the row elements and the column elements of the table to be extracted is determined, the table can be extracted from the engineering drawing according to that position relation. Illustratively, the first elements may be stored in a Python list and output or displayed as the business requires; once stored, they are readily available for subsequent logic and business judgment.
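Storing the aligned elements as a Python list might look like the sketch below. The header names follow the floor height example, while the cell values and the helper names `to_table` and `to_records` are hypothetical.

```python
def to_table(headers, rows):
    """Store the table as a list of lists, with the header row first."""
    return [list(headers)] + [list(r) for r in rows]


def to_records(table):
    """Convert the list-of-lists table into one dict per data row."""
    header, *body = table
    return [dict(zip(header, row)) for row in body]


headers = ["layer number", "structure elevation", "layer height"]
rows = [["2", "7.200", "3.00"], ["1", "4.200", "2.90"]]  # illustrative values

table = to_table(headers, rows)
records = to_records(table)
print(table)
print(records[1]["layer height"])
```

The dict form keeps each value tied to its header, which is convenient for the later rule checks mentioned above.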
It should be understood that the descriptions of row and column cases in the foregoing embodiments are only examples and do not mean that the present solution can only implement those examples. Where an example is given for rows, the column case is similar and, to avoid redundant description, is not listed separately; it may be understood by reference to the row example.
In the application, the positions and values of the elements belonging to the row titles or column titles are determined from the characteristic data format of those titles, and keyword comparison is used as an auxiliary check that those elements are valid; this double check makes the method robust.
In addition, compared with table extraction in the prior art, the present method depends only weakly on the lines of the table; the lines merely assist in finding information. If two data items in a column of the table are not separated by a horizontal line, the method can determine the separation between them from the position relation of the elements. Therefore, missing or truncated lines in the table have little effect on the extraction of the table information.
Where the elements of the row header or column header are not aligned with the corresponding elements of the column or row, or where merged cells exist, the alignment allows a certain tolerance so that the header elements and the other elements are still matched one-to-one. Decision logic is also added for the additional auxiliary graphic element information, to determine its relation to the elements belonging to the row header or column header.
The whole table to be extracted is output in a standardized, structured data format that represents the internal correspondence of the data in the table and facilitates subsequent business and logical judgment.
Based on the same concept, an embodiment of the present application provides a form extraction device in engineering drawings. For the specific implementation of the device, refer to the description of the method embodiments; repeated details are omitted. As shown in fig. 3, the device mainly includes:
the first determining module 301 is configured to determine a to-be-identified area in the engineering drawing, where the to-be-identified area includes a first element in a to-be-extracted form;
an obtaining module 302, configured to obtain position information of each first element in the area to be identified;
a second determining module 303, configured to determine a position relationship between the first elements according to the position information;
and the table extraction module 304 is configured to store the first element in a table form according to the position relationship, so as to obtain table data in the engineering drawing.
Based on the same concept, an embodiment of the present application further provides an electronic device, as shown in fig. 4, the electronic device mainly includes: a processor 401, a memory 402 and a communication bus 403, wherein the processor 401 and the memory 402 communicate with each other via the communication bus 403. The memory 402 stores a program executable by the processor 401, and the processor 401 executes the program stored in the memory 402, so as to implement the following steps:
determining a region to be identified in the engineering drawing, wherein the region to be identified comprises a first element in a form to be extracted;
acquiring position information of each first element in the area to be identified;
determining the position relation among the first elements according to the position information;
and storing the first element in a table form according to the position relation to obtain a table to be extracted in the engineering drawing.
The communication bus 403 mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 403 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 4, but this does not indicate only one bus or one type of bus.
The memory 402 may include a Random Access Memory (RAM) or a non-volatile memory, such as at least one disk memory. Alternatively, the memory may be at least one storage device located remotely from the aforementioned processor 401.
The Processor 401 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc., and may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic devices, discrete gates or transistor logic devices, and discrete hardware components.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for extracting a table in an engineering drawing provided in any one of the method embodiments described above.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes, etc.), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), among others.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for extracting a form in an engineering drawing is characterized by comprising the following steps:
determining a region to be identified in an engineering drawing, wherein the region to be identified comprises a first element in a form to be extracted;
acquiring position information of each first element in the area to be identified;
determining the position relation among the first elements according to the position information;
and storing the first element in a table form according to the position relation to obtain the table to be extracted in the engineering drawing.
2. The method of claim 1, wherein the determining the area to be identified in the engineering drawing comprises:
identifying each original element in the engineering drawing;
matching the original elements with a first preset character element set, wherein the first preset character element set comprises character elements which represent that the table to be extracted is a preset type table;
and taking the area where the successfully matched original element is located as the area to be identified.
3. The method according to claim 1, wherein the determining the position relationship between the first elements according to the position information comprises:
determining a second element belonging to a row or column header in the first element;
determining a third element in the same column or row as the second element according to the position information;
aligning the third elements in a row or column direction according to the position information;
and taking the row-column relationship between the second element and the third element as the position relationship.
4. The method of claim 3, wherein the determining the second element belonging to the row or column header in the first element comprises:
matching the first element with a second preset text element set, wherein the second preset text element set comprises text elements representing row titles or column titles of the preset type table to be extracted;
acquiring the vertical coordinate or horizontal coordinate of the successfully matched element;
and taking the first element whose position information is within a first preset range of the ordinate or the abscissa as the second element.
5. The method of claim 3, wherein the location information comprises an abscissa and an ordinate; determining a third element in the same column or row as the second element according to the position information includes:
acquiring the abscissa or the ordinate of the second element;
and taking the original element in a second preset range of the ordinate or the abscissa of the second element as the third element.
6. The method of claim 3, wherein aligning the third element in a row or column direction according to the position information comprises:
determining a target element in the second elements;
acquiring sub-target elements of a column or a row where the target elements are located;
acquiring the ordinate or abscissa of each sub-target element;
and treating the third elements whose ordinate or abscissa is within a third preset range of the sub-target element as being in the same row or column.
7. The method according to claim 1, wherein the determining a position relationship between the first elements according to the position information further comprises:
determining a fourth element matched with a third preset text element set in the first element, wherein the third preset text element set comprises text elements that are located outside the preset type table and represent that the table to be extracted is a preset type table;
determining a fifth element in the preset range of the fourth element according to the position information, wherein the fifth element is a line element;
and determining the position relation between the fifth element and the third element according to the position information of the fifth element.
8. A form extraction device in engineering drawings is characterized by comprising:
a first determining module, configured to determine a to-be-identified area in an engineering drawing, wherein the to-be-identified area comprises a first element in a to-be-extracted form;
the acquisition module is used for acquiring the position information of each first element in the area to be identified;
the second determining module is used for determining the position relation among the first elements according to the position information;
and the table extraction module is used for storing the first element in a table form according to the position relation to obtain table data in the engineering drawing.
9. An electronic device is characterized by comprising a processor, a memory and a communication bus, wherein the processor and the memory are communicated with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the steps of the method for extracting a form in an engineering drawing according to any one of claims 1 to 7 when executing the program stored in the memory.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for extracting a form in an engineering drawing according to any one of claims 1 to 7.
CN202110729449.1A 2021-06-29 2021-06-29 Form extraction method and device in engineering drawing, electronic equipment and storage medium Pending CN113505669A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110729449.1A CN113505669A (en) 2021-06-29 2021-06-29 Form extraction method and device in engineering drawing, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110729449.1A CN113505669A (en) 2021-06-29 2021-06-29 Form extraction method and device in engineering drawing, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113505669A true CN113505669A (en) 2021-10-15

Family

ID=78009528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110729449.1A Pending CN113505669A (en) 2021-06-29 2021-06-29 Form extraction method and device in engineering drawing, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113505669A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114724158A (en) * 2022-04-21 2022-07-08 北京梦诚科技有限公司 Engineering quantity auditing method and system, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382717A (en) * 2020-03-17 2020-07-07 腾讯科技(深圳)有限公司 Table identification method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211015