CN113537026B

CN113537026B - Method, device, equipment and medium for detecting graphic elements in building plan

Info

Publication number: CN113537026B
Application number: CN202110775938.0A
Authority: CN
Inventors: 崔淼; 陈成才
Original assignee: Shanghai Xiaoi Robot Technology Co Ltd
Current assignee: Shanghai Xiaoi Robot Technology Co Ltd
Priority date: 2021-07-09
Filing date: 2021-07-09
Publication date: 2023-05-23
Anticipated expiration: 2041-07-09
Also published as: CN113537026A

Abstract

The embodiment of the invention discloses a method, a device, equipment and a medium for detecting primitives in a building plan. The method comprises the following steps: extracting multi-channel basic image features under multiple scales from a building plan to be identified; after the receptive field of the multi-channel basic image features under each scale is enlarged by adopting an atrous (hole) convolution algorithm, performing feature fusion on the multi-channel basic image features under each scale to obtain multi-channel fused image features; obtaining a plurality of segmentation maps according to the multi-channel fused image features; and merging the primitives with different kernel proportions in the segmentation maps by adopting a progressive scale expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region. In this technical scheme, the features extracted from the building plan are processed by an artificial intelligence algorithm to obtain the primitive identification result, so that accurate detection of the primitives in the building plan is realized and the missed and false detections caused by occlusion or interference are avoided.

Description

Method, device, equipment and medium for detecting graphic elements in building plan

Technical Field

The embodiment of the invention relates to the technical field of image processing, in particular to a method, a device, equipment and a medium for detecting graphic primitives in a building plan.

Background

With its rapid development, artificial intelligence technology has been widely applied in scenarios such as financial services, medical imaging, sequencing diagnosis, machine vision and industrial inspection.

Currently, in the building industry, and especially in the review of building drawings, a large amount of information must be audited: for example, whether the components (i.e. the primitives) in a building plan are fully annotated with the overall dimensions and the bay or depth dimensions of the plan, and whether the indexes of the components (such as area, length and perimeter) are calculated according to the specification requirements or planning conditions. However, when a building plan is analyzed, detection is often affected by the diversity of component backgrounds, occlusion by other components, and interference from auxiliary lines and text, resulting in missed and false detections. How to detect the primitives in a building plan accurately with an artificial intelligence algorithm, while avoiding the missed and false detections caused by occlusion or interference, is therefore a problem that urgently needs to be solved.

Disclosure of Invention

The embodiment of the invention provides a method, a device, equipment and a medium for detecting primitives in a building plan, so as to detect the primitives in a building plan accurately based on an artificial intelligence algorithm and to avoid the missed and false detections caused by occlusion or interference.

In a first aspect, an embodiment of the present invention provides a method for detecting primitives in a building plan, including:

extracting multichannel basic image features under multiple scales from a building plan to be identified;

after the receptive field of the multi-channel basic image features under each scale is enlarged by adopting an atrous (hole) convolution algorithm, performing feature fusion on the multi-channel basic image features under each scale to obtain multi-channel fused image features;

obtaining a plurality of segmentation maps according to the multi-channel fused image features, wherein the primitives included in different segmentation maps correspond to different kernel scales;

and merging the primitives with different kernel proportions in the segmentation maps by adopting a progressive scale expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region.

In a second aspect, an embodiment of the present invention further provides a primitive detection device in a building plan, including:

the multi-channel basic image feature extraction module is used for extracting multi-channel basic image features under multiple scales from a building plan to be identified;

the multi-channel fused image feature generation module is used for performing feature fusion on the multi-channel basic image features under each scale, after the receptive field of the multi-channel basic image features under each scale is enlarged by adopting an atrous (hole) convolution algorithm, to obtain multi-channel fused image features;

the segmentation map generation module is used for obtaining a plurality of segmentation maps according to the multi-channel fused image features, wherein the primitives included in different segmentation maps correspond to different kernel scales;

the primitive identification result acquisition module is used for merging the primitives with different kernel proportions in the segmentation maps by adopting a progressive scale expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region.

In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the primitive detection method in the building plan according to any embodiment of the present invention when the processor executes the program.

In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where a computer program is stored, where the program when executed by a processor implements a primitive detection method in a building plan according to any embodiment of the present invention.

According to the technical scheme provided by the embodiment of the invention, multi-channel basic image features under multiple scales are extracted from the building plan to be identified. An atrous (hole) convolution algorithm is adopted to enlarge the receptive field of the multi-channel basic image features under each scale, after which feature fusion is performed to obtain multi-channel fused image features. A plurality of segmentation maps are then obtained according to the multi-channel fused image features, and the primitives with different kernel proportions in the segmentation maps are merged by a progressive scale expansion algorithm to obtain at least one primitive region and the primitive identification result corresponding to the primitive region. Because the features extracted from the building plan are processed by an artificial intelligence algorithm, the detection effect on the primitives can be effectively improved, accurate detection of the primitives in the building plan is realized, and the missed and false detections caused by occlusion or interference are avoided.

Drawings

FIG. 1 is a schematic flow chart of a method for detecting primitives in a building plan according to a first embodiment of the present invention;

FIG. 2a is a schematic flow chart of a primitive detection method in a building plan according to a second embodiment of the present invention;

FIG. 2b is a schematic diagram of a model structure for obtaining primitive recognition results of a building plan according to a second embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a primitive detecting device in a building plan view according to a third embodiment of the present invention;

FIG. 4 is a schematic diagram of the hardware structure of a computer device according to a fourth embodiment of the present invention.

Detailed Description

The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.

Before discussing exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or at the same time. Furthermore, the order of the operations may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.

Example 1

Fig. 1 is a flowchart of a primitive detection method in a building plan according to the first embodiment of the present invention. This embodiment is applicable to the accurate detection of primitives in a building plan. The method may be performed by the primitive detection device in a building plan provided by the embodiments of the present invention; the device may be implemented in software and/or hardware and may generally be integrated in a computer device.

As shown in fig. 1, the method for detecting primitives in a building plan provided in this embodiment specifically includes:

S110, extracting multi-channel basic image features under multiple scales from the building plan to be identified.

The building plan to be identified refers to a building plan to be subjected to primitive detection. In the embodiment of the present invention, the graphic element refers to a member in a building plan, that is, each element constituting a building, such as a floor, a wall, a roof beam, etc.

Multi-channel basic image features refer to image features at multiple scales extracted from a building plan.

In order that the extracted image features better characterize different areas of the building plan, multi-channel basic image features at multiple scales can be extracted from it. Multi-channel basic image features with high resolution (namely low dimension) carry rich detail information and have a smaller receptive field, and are suitable for detecting small targets; multi-channel basic image features with low resolution (namely high dimension) carry richer semantic information and have a larger receptive field, and are suitable for detecting large targets. The receptive field (Receptive Field) refers to the size of the area on the input picture to which a pixel of the image features output by a layer of a convolutional neural network is mapped. In the embodiment of the invention, the receptive field refers to the size of the area on the building plan to which the multi-channel basic image features are mapped, that is, the area of the building plan corresponding to each point of the multi-channel basic image features.
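The relationship between network depth, stride and receptive field described above can be checked with simple arithmetic. The following is a minimal sketch, not part of the disclosed method: the function name `receptive_field` and the three layer stacks are hypothetical, and each layer is described only by its kernel size, stride and dilation.

```python
def receptive_field(layers):
    """Return the receptive field, in input pixels, of a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    rf = 1    # one input pixel sees only itself
    jump = 1  # spacing, in input pixels, between neighbouring outputs
    for kernel_size, stride, dilation in layers:
        effective_kernel = dilation * (kernel_size - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks, chosen only for illustration.
shallow = [(3, 1, 1), (3, 1, 1)]          # two 3x3 convolutions, stride 1
deep = [(3, 2, 1), (3, 2, 1), (3, 2, 1)]  # three 3x3 convolutions, stride 2
dilated = [(3, 1, 1), (3, 1, 2)]          # second convolution has dilation 2

print(receptive_field(shallow), receptive_field(deep), receptive_field(dilated))
```

The strided stack produces a lower-resolution feature whose points each cover a wider area of the plan, which is why such features suit large targets, while the shallow stack keeps fine detail for small targets.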

It should be noted that, before extracting the multi-channel basic image features under multiple scales from the building plan to be identified, any target detection algorithm in the prior art, such as YOLO-v3 (You Only Look Once-version 3) algorithm or SSD (Single Shot MultiBox Detector) algorithm, may be used to detect the standard building drawing and obtain the building plan to be identified, which is not limited in the embodiment of the present invention. The standard building drawing refers to a building engineering drawing comprising a plurality of frames (such as small frames corresponding to a building plan, building design description frames and other auxiliary frames).

S120, after the receptive field of the multi-channel basic image features under each scale is enlarged by adopting an atrous (hole) convolution algorithm, performing feature fusion on the multi-channel basic image features under each scale to obtain multi-channel fused image features.

The hole convolution (Atrous Convolution) algorithm, also called the dilated convolution (or expanded convolution) algorithm, injects holes into the standard convolution kernel so as to enlarge the receptive field without adding kernel weights, which saves computation compared with an equally large dense kernel.
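As an illustration of how injecting holes widens the area a kernel covers, the sketch below applies the same three-tap kernel to a one-dimensional signal with and without holes. It is a simplified, assumed example: the function name `dilated_conv1d`, the signal and the kernel values are hypothetical, and a real network would operate on two-dimensional multi-channel features.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D convolution (cross-correlation) with holes between taps."""
    span = dilation * (len(kernel) - 1) + 1  # input samples covered per output
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0
        for tap, weight in enumerate(kernel):
            acc += weight * signal[start + tap * dilation]
        out.append(acc)
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)  # each output covers 3 samples
holes = dilated_conv1d(signal, kernel, dilation=2)  # each output covers 5 samples
print(dense, holes)
```

Both calls perform three multiplications per output, but with `dilation=2` each output depends on a span of five input samples instead of three.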

The multi-channel fusion image features refer to image features obtained by fusing multi-channel basic image features under multiple scales, and can represent multi-dimensional scale features in a building plan.

Image features at a low-dimensional scale have higher feature resolution and contain more detail information, whereas image features at a high-dimensional scale have lower feature resolution and poorer perception of detail. Therefore, so that the extracted features better represent the building plan, the multi-channel basic image features at different scales can be fused. Before feature fusion, the hole convolution algorithm can be adopted to enlarge the receptive field of the multi-channel basic image features at each scale and thereby capture the context information of those features within the building plan. Feature fusion is then performed on the multi-channel basic image features processed by the hole convolution algorithm, so that feature information at different scales is fused together into the multi-channel fused image features. In this way, feature information of the primitives in the building plan can be obtained at each scale, in particular for primitives occluded by auxiliary lines, text or other interference, which improves the primitive detection performance on the building plan.

S130, obtaining a plurality of segmentation maps according to the multi-channel fused image features, wherein the primitives included in different segmentation maps correspond to different kernel scales.
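The text does not fix a particular fusion operator. One common choice, assumed here purely for illustration, is to bring every scale to the finest resolution by nearest-neighbour up-sampling and then stack the channels. The names `upsample_nearest` and `fuse` and the toy feature values are hypothetical.

```python
def upsample_nearest(feature, factor):
    """Upsample a 2-D map (list of rows) by repeating each value factor x factor."""
    out = []
    for row in feature:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(pyramid):
    """Stack all channels of all scales at the finest resolution.

    pyramid: list of scales, finest first; each scale is a list of 2-D channels.
    Assumes each coarser scale divides the finest height evenly.
    """
    target_height = len(pyramid[0][0])
    fused = []
    for channels in pyramid:
        factor = target_height // len(channels[0])
        for channel in channels:
            fused.append(upsample_nearest(channel, factor))
    return fused


fine = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]  # 1 channel, 4x4
coarse = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]                             # 2 channels, 2x2
fused = fuse([fine, coarse])
print(len(fused), fused[1])
```

After fusion every position of the 4x4 grid carries both the fine-scale detail value and the two coarse-scale context values.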

The segmentation map refers to a plurality of feature maps obtained by segmenting the multi-channel fusion image features.

Kernel scale refers to the segmentation scale of the multi-channel fused image features corresponding to each segmentation map. The segmentation map corresponding to the largest kernel scale corresponds to the multi-channel fused image features themselves.

In order to detect crossing or overlapping primitives, the edge information of each primitive needs to be detected accurately. Therefore, before detection, the multi-channel fused image features can be segmented into a plurality of segmentation maps according to different kernel scales. The segmentation maps obtained have the same shape and center point as the multi-channel fused image features but gradually increase in scale, so that the primitives in the segmentation maps can be merged by the progressive scale expansion algorithm.

S140, merging the primitives with different kernel proportions in the segmentation maps by adopting a progressive scale expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region.

The progressive scale expansion (Progressive Scale Expansion, PSE) algorithm is a segmentation-based text detection algorithm that can separate adjacent text regions. In the embodiment of the invention, adopting the progressive scale expansion algorithm makes it possible to locate primitives of any shape in the building plan and to distinguish effectively the boundaries of adjacent or partially overlapping primitives.

Kernel proportion refers to the proportion by which the primitives in each segmentation map are contracted relative to the multi-channel fused image features.

The primitive region refers to a region which is obtained after merging the primitives in each divided graph and has the same size as the divided graph with the largest kernel scale, namely the region where each primitive is located in the building plan.

The primitive recognition result refers to a primitive detection result, for example, an area range corresponding to each primitive in the building plan, position information of each primitive area in the building plan, and the like.

The advantage of this arrangement is that merging the primitives with different kernel proportions in the segmentation maps by the progressive scale expansion algorithm yields at least one complete primitive region, so that the boundary of each primitive can be detected effectively and adjacent primitive regions can be distinguished, avoiding the missed and false detections caused by occlusion, interference or dense primitive distribution. The progressive scale expansion algorithm also locates the primitive regions, giving the position information of each primitive region in the building plan.

As an optional implementation, obtaining a plurality of segmentation maps according to the multi-channel fused image features may include: performing channel fusion on the multi-channel fused image features to obtain a target fused image; and inputting the target fused image into an image segmentation model and outputting a plurality of segmentation maps through the image segmentation model, wherein each segmentation map comprises a mask map of all primitives at a set kernel scale.

Merging the primitives with different kernel proportions in the segmentation maps by adopting a progressive scale expansion algorithm to obtain at least one primitive region may include: merging the primitives included in the segmentation maps in sequence, in order of kernel proportion from small to large, by adopting a breadth-first method, to obtain at least one primitive region.

Channel fusion is used to fuse the multi-channel fused image features into image features with the same number of channels (the number of convolution kernels), so as to obtain the target fused image. For example, 1*1 convolution kernels with a set number of channels may be used to perform channel fusion on the multi-channel fused image features to obtain the target fused image.
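A 1*1 convolution mixes channels independently at each pixel position, so the number of output channels equals the number of kernels. The sketch below assumes hypothetical hand-picked weights (in a trained model they would be learned); the name `conv1x1` is illustrative only.

```python
def conv1x1(channels, weights):
    """Mix C_in channels into C_out channels with a 1x1 convolution.

    channels: list of C_in 2-D maps of equal size.
    weights:  C_out rows, each holding C_in coefficients (hypothetical values).
    """
    height, width = len(channels[0]), len(channels[0][0])
    out = []
    for kernel in weights:
        mixed = [[sum(w * ch[y][x] for w, ch in zip(kernel, channels))
                  for x in range(width)]
                 for y in range(height)]
        out.append(mixed)
    return out


features = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
]
weights = [[1, 1, 1], [1, 0, -1]]  # two kernels, so two output channels
target = conv1x1(features, weights)
print(target)
```

Here three input channels are reduced to two, and the spatial size of each map is unchanged.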

The image segmentation model is used for carrying out image segmentation on the input target fusion image and outputting a plurality of segmented images obtained after segmentation.

The mask diagram refers to a segmentation result of different primitives in a multi-channel fused image feature under a certain scale (namely, a set kernel scale), wherein in the mask diagram, pixels at positions of the primitives are generally black, and pixels at positions other than the primitives are generally white.

The breadth-first method (Breadth First Search, BFS) is an uninformed (blind) search method that visits the nodes of an image level by level until a result is found. In the embodiment of the present invention, the breadth-first method is used to merge the primitives with different kernel proportions in the segmentation maps until at least one primitive region is obtained.

Channel fusion is performed on the multi-channel fused image features to obtain a target fused image with a uniform number of channels, and the target fused image is input into the image segmentation model to obtain a plurality of segmentation maps. A breadth-first method is then adopted: starting from the segmentation map with the smallest kernel proportion, the segmentation maps with successively larger kernel proportions are merged in turn to expand the region, until the segmentation map with the largest kernel proportion has been merged. That is, the primitives in the segmentation map with the smallest kernel proportion are expanded step by step until they cover the primitives in the segmentation map with the largest kernel proportion, yielding at least one primitive region. Expanding the primitive detection region gradually from the smallest to the largest across the segmentation maps yields complete primitive regions, is robust to primitives of any shape, and can quickly and accurately separate primitive boundaries that are close to each other or even partially overlapping.

According to the technical scheme provided by the embodiment of the invention, multi-channel basic image features under multiple scales are extracted from the building plan to be identified. An atrous (hole) convolution algorithm is adopted to enlarge the receptive field of the multi-channel basic image features under each scale, after which feature fusion is performed to obtain multi-channel fused image features. A plurality of segmentation maps are then obtained according to the multi-channel fused image features, and the primitives with different kernel proportions in the segmentation maps are merged by a progressive scale expansion algorithm to obtain at least one primitive region and the primitive identification result corresponding to the primitive region. Because the features extracted from the building plan are processed by an artificial intelligence algorithm, the detection effect on the primitives can be effectively improved, a primitive identification result with high accuracy is obtained, accurate detection of the primitives in the building plan is realized, and the missed and false detections caused by occlusion or interference are avoided.
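The breadth-first merging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: masks are lists of rows in which a non-zero value marks a primitive pixel, connectivity is 4-neighbour, and conflicts are resolved on a first-come, first-served basis. The function names and the toy masks are hypothetical.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask with 1, 2, ..."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def progressive_scale_expansion(kernels):
    """Merge kernel masks ordered from smallest to largest into labelled regions."""
    labels, count = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Seed the queue with every pixel labelled so far, then grow outward.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBORS:
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and not labels[ny][nx]):
                    labels[ny][nx] = labels[cy][cx]  # first label to arrive wins
                    queue.append((ny, nx))
    return labels, count


# Two primitives that touch at the largest kernel scale but not at the smallest.
small = [[0, 0, 0, 0, 0, 0, 0],
         [0, 1, 0, 0, 0, 1, 0],
         [0, 0, 0, 0, 0, 0, 0]]
large = [[0, 0, 0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0, 0, 0]]

regions, n_primitives = progressive_scale_expansion([small, large])
print(n_primitives, regions[1])
```

In the largest mask the two primitives form a single connected stroke, yet the result keeps two labels because each region grows outward from its own small kernel and stops where it meets the other.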

In an optional implementation manner of this embodiment, before extracting the multi-channel basic image feature under the set low-dimensional scale from the building plan to be identified, the method may further include:

pre-identifying a standard building drawing by adopting a morphological algorithm, and intercepting at least one candidate frame detection area from the standard building drawing according to the pre-identification result, wherein the standard building drawing comprises at least one small frame whose image size is smaller than or equal to a preset standard identification size;

for each candidate frame detection area, performing the following frame detection processing operations:

extracting multi-channel basic image features in the candidate frame detection area;

extracting multi-channel high-dimensional image features from the basic image features while keeping the basic image features from being lost, and enhancing the image feature quality of each high-dimensional image feature;

performing feature fusion on the multi-channel high-dimensional image features while keeping the high-dimensional image features from being lost, to obtain multi-channel fused image features;

and acquiring a frame identification result of the candidate frame detection area according to the multi-channel fused image features, and taking each identified frame as a building plan to be identified.

Morphological algorithm refers to an algorithm for analyzing and identifying an image by measuring and extracting corresponding shapes in the image through structural elements with certain shapes. In the embodiment of the invention, a morphological algorithm is used for acquiring frame information of a standard building drawing.

The pre-recognition result refers to at least one small frame, recognized in the standard building drawing by the morphological algorithm, whose image size is smaller than or equal to the preset standard recognition size. The preset standard recognition size refers to the preset maximum size of a frame that can be recognized by the morphological algorithm.

The candidate frame detection area refers to an area cropped from a standard building drawing.

Image feature quality is used to measure the extent to which the image features can characterize the candidate frame detection area.

The frame recognition result refers to a result obtained after frame detection is performed on the candidate frame detection area.

A standard building drawing may include a plurality of frames; for example, the standard building drawing of a residential project may include frames such as a drawing catalog and a building design description. Therefore, before primitive detection is performed on a building plan, the building plan to be identified needs to be detected intelligently in the standard building drawing. Specifically: first, the standard building drawing is pre-recognized by a morphological algorithm to obtain at least one small frame smaller than or equal to the preset standard recognition size; then at least one candidate frame detection area is cropped from the standard building drawing according to the recognized small frames, and frame detection is performed on each area to determine whether the frame is a building plan to be identified.

Frame detection on a candidate frame detection area specifically comprises the following steps: first, convolution processing can be performed on the candidate frame detection area with convolution kernels of different channel numbers to extract multi-channel basic image features; second, multi-channel high-dimensional image features are extracted from the basic image features and their image feature quality is enhanced; then, feature fusion is performed on the multi-channel high-dimensional image features to obtain multi-channel fused image features; finally, the frame recognition result of the candidate frame detection area is obtained according to the multi-channel fused image features, and each recognized frame is taken as a building plan to be identified.

The advantage of this arrangement is that at least one small frame can be identified in the standard building drawing by the morphological algorithm, which reduces the resolution of the images input into the detection model and avoids the problem that the detection model cannot recognize correctly when a standard building drawing with very high image resolution is input directly. At least one candidate frame detection area is then cropped from the standard building drawing according to the identified small frames and detected intelligently to determine the building plan to be identified, so that professional staff need not audit the drawings manually, the speed of finding the building plan to be identified in the standard building drawing is improved, and automatic drawing audit and frame information retrieval are realized.

On the basis of the above embodiments, pre-identifying the standard building drawing by adopting a morphological algorithm, and intercepting at least one candidate frame detection area from the standard building drawing according to the pre-identification result, may include:

performing binarization processing on the standard building drawing to obtain a binarized image; performing erosion and/or dilation processing on the binarized image to smooth the object boundaries in the binarized image; performing edge point detection on the processed binarized image to obtain a plurality of edge points, and performing connected domain detection according to the detected edge points to obtain the position coordinate range of each detected connected domain in the binarized image; and cropping, from the standard building drawing, the candidate frame detection area corresponding to each connected domain according to the position coordinate ranges.
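The pre-identification steps can be sketched on a tiny grayscale grid. This is a simplified illustration under stated assumptions: the threshold of 128, the 3x3 structuring element and all function names are hypothetical, and the sketch labels connected foreground pixels directly rather than running a separate edge point detection pass.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Dark strokes (value < threshold) become foreground 1, the rest 0."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def dilate(mask):
    """3x3 dilation: set a pixel if any pixel in its neighbourhood is set."""
    h, w = len(mask), len(mask[0])
    return [[int(any(mask[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]


def erode(mask):
    """3x3 erosion: keep a pixel only if its whole neighbourhood is set."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                     for ny in range(y - 1, y + 2)
                     for nx in range(x - 1, x + 2)))
             for x in range(w)] for y in range(h)]


def component_boxes(mask):
    """Return (top, left, bottom, right) of each 8-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the inclusive box out of the original drawing."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 5x12 drawing: white background, two dark strokes, the first with a gap.
gray = [[255] * 12 for _ in range(5)]
for x in (1, 2, 4, 8, 9, 10):
    gray[2][x] = 0

raw_boxes = component_boxes(binarize(gray))
boxes = component_boxes(erode(dilate(binarize(gray))))  # dilation then erosion
first_region = crop(gray, boxes[0])
print(raw_boxes, boxes, first_region)
```

Without the morphological step the broken stroke is reported as two separate regions; after dilation followed by erosion the gap is bridged and each stroke yields one coordinate range, which is then used to crop the original drawing.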

On the basis of the above embodiments, in the candidate frame detection area, extracting the base image features of the multiple channels may include:

inputting the candidate frame detection area into a lightweight network, and inputting the output results of a plurality of bottleneck layers of the lightweight network into a path aggregation network to obtain multi-channel basic image features;

wherein different bottleneck layers are used for outputting basic image features of different scales.

On the basis of the above embodiments, on the basis of keeping the basic image features not missing, extracting multi-channel high-dimensional image features from the basic image features may include:

inputting the multi-channel basic image features into a spatial pyramid pooling network, and extracting the multi-channel high-dimensional image features with standard dimensions from the multi-scale multi-channel basic image features through the spatial pyramid pooling network.
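The idea of obtaining features of a standard dimension from inputs of different sizes can be illustrated with the classic fixed-grid form of spatial pyramid pooling. Whether the embodiment uses this form or another variant is not stated, so the sketch below is an assumption; the function names and pyramid levels are hypothetical.

```python
def max_pool_grid(feature, bins):
    """Max-pool a 2-D map into a bins x bins grid, whatever its input size."""
    h, w = len(feature), len(feature[0])
    pooled = []
    for by in range(bins):
        y0 = (by * h) // bins            # floor of the bin's upper edge
        y1 = -(-(by + 1) * h // bins)    # ceiling of the bin's lower edge
        for bx in range(bins):
            x0 = (bx * w) // bins
            x1 = -(-(bx + 1) * w // bins)
            pooled.append(max(feature[y][x]
                              for y in range(y0, y1)
                              for x in range(x0, x1)))
    return pooled


def spatial_pyramid_pool(feature, levels=(1, 2)):
    """Concatenate grid poolings; output length is sum(b * b for b in levels)."""
    out = []
    for bins in levels:
        out.extend(max_pool_grid(feature, bins))
    return out


small_map = [[1, 2], [3, 4]]                                 # 2x2 input
large_map = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]    # 3x4 input

pooled_small = spatial_pyramid_pool(small_map)
pooled_large = spatial_pyramid_pool(large_map)
print(pooled_small, pooled_large)
```

Both inputs produce a vector of the same length even though their spatial sizes differ, which is the property that lets later layers assume a fixed dimension.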

On the basis of the above embodiments, enhancing the image feature quality of each high-dimensional image feature may include:

inputting the multi-channel high-dimensional image features into a sub-pixel convolution network, and inserting each low-resolution high-dimensional image feature into a high-resolution feature map through the sub-pixel convolution network, so as to enhance the feature quality of each high-dimensional image feature.
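The rearrangement at the heart of sub-pixel convolution (often called pixel shuffle) can be shown in isolation. The sketch below covers only this rearrangement step and assumes the preceding convolution has already produced `r * r` low-resolution channels per output channel; the function name, channel ordering and values are illustrative assumptions.

```python
def pixel_shuffle(channels, r):
    """Rearrange C*r*r low-resolution channels into C channels upscaled by r."""
    h, w = len(channels[0]), len(channels[0][0])
    c_out = len(channels) // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = channels[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


low_res = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
    [[1000, 2000], [3000, 4000]],
]
high_res = pixel_shuffle(low_res, 2)
for row in high_res[0]:
    print(row)
```

Four 2x2 channels become one 4x4 channel: every low-resolution value is kept and placed at its own sub-pixel position, so resolution increases without interpolation.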

On the basis of the above embodiments, on the basis of keeping the high-dimensional image features not missing, performing feature fusion on the high-dimensional image features of the multiple channels to obtain fused image features of the multiple channels, which may include:

performing convolution processing on the multi-channel high-dimensional image features by using a set number of 1*1 convolution kernels to obtain multi-channel fused image features.

Based on the above embodiments, obtaining the frame recognition result of the candidate frame detection area according to the multi-channel fused image feature may include:

and respectively inputting the multi-channel fusion image characteristics into a classification network and a positioning network, and identifying the region position coordinates of the picture frame in the candidate picture frame detection region through the classification result output by the classification network and the positioning result output by the positioning network.

On the basis of the above embodiments, performing each frame detection processing operation for each candidate frame detection area may specifically include:

inputting each candidate frame detection area into a pre-trained frame recognition model respectively, and acquiring the frame recognition result output by the frame recognition model for each candidate frame detection area;

the frame recognition model specifically comprises: a lightweight network, a path aggregation network, a spatial pyramid pooling network, a sub-pixel convolution network, 1*1 convolution kernels, a classification network, and a positioning network;

the training samples used in training the frame recognition model comprise: standard building drawings in which the frame position of each building plan is annotated in advance.

Example two

Fig. 2a is a flowchart of a primitive detection method in a building plan according to a second embodiment of the present invention. This embodiment further refines the foregoing embodiment, in which extracting the multi-channel basic image features under multiple scales from the building plan to be identified may specifically be:

inputting the building plan into a residual network, and obtaining the output results of a plurality of residual blocks of the residual network as the multi-channel basic image features under a plurality of scales;

wherein each residual block is used for outputting multichannel basic image characteristics under a set scale.

Further, before the cavity convolution algorithm is adopted to increase the receptive field of the multi-channel basic image features under each scale, the method further comprises:

performing up-sampling processing on the multi-channel basic image features under each scale respectively, so as to add high-dimensional features to the multi-channel basic image features.

As shown in fig. 2a, the method for detecting primitives in a building plan provided in this embodiment specifically includes:

S210, inputting the building plan into a residual network, and obtaining output results of a plurality of residual blocks of the residual network as multi-channel basic image features under a plurality of scales.

The residual network (ResNet) is used for extracting multi-scale image features from the building plan.

A residual block (ResBlock) is a basic structural unit of the residual network; each residual block is used to output multi-channel basic image features at a set scale.

It will be appreciated that a plurality of residual blocks may be provided in one residual network, and different residual blocks may output image features at different scales. When the building plan is input into the residual network, image features at the different set scales are output respectively, so that a plurality of multi-channel basic image features at different scales are obtained.

S220, respectively carrying out up-sampling processing on the multi-channel basic image features under each scale so as to increase high-dimensional features in the multi-channel basic image features.

The up-sampling is used for adding high-dimensional feature information and aggregating image semantic information in the multi-channel basic image features.

For example, the multi-channel basic image features under each scale are first passed through 1*1 convolution kernels with a preset number of channels and are then up-sampled by a factor of 2. In this way, high-dimensional features are added to the multi-channel basic image features without losing any of the existing features, so that the multi-channel basic image features can contain features related to image semantics, which in turn improves the primitive detection effect in the building plan.
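The factor-of-2 up-sampling step can be illustrated on a single feature map. The text does not state which interpolation is used, so nearest-neighbour repetition is assumed here purely for illustration; the function name `upsample2x_nearest` is likewise hypothetical.

```python
def upsample2x_nearest(feature_map):
    """Double the height and width of an H x W map by repeating each value.

    Nearest-neighbour interpolation is an assumption; bilinear interpolation
    or a learned up-sampling layer would fit the description equally well.
    """
    out = []
    for row in feature_map:
        wide = [value for value in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                             # repeat the row
    return out


small = [[1, 2], [3, 4]]
large = upsample2x_nearest(small)
for row in large:
    print(row)
```

Every input value still appears in the output, which matches the requirement that no existing feature is lost by the up-sampling.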

S230, after the receptive field of the multi-channel basic image features under each scale is increased by adopting a cavity convolution algorithm, the multi-channel basic image features under each scale are subjected to feature fusion, and multi-channel fusion image features are obtained.

Optionally, increasing the receptive field of the multi-channel basic image features under each scale by using the cavity convolution algorithm may include: acquiring the cavity convolution ratio corresponding to the multi-channel basic image features of each scale; and convolving the convolution kernel of each cavity convolution ratio with the matched multi-channel basic image features, so as to increase the receptive field of the multi-channel basic image features under each scale.

Here, the cavity convolution ratio, i.e., the dilation rate (cavity convolution is also known as dilated or atrous convolution), refers to the spacing between the points of the convolution kernel and indicates how much the receptive field is enlarged. A smaller cavity convolution ratio gives a smaller receptive field, which is beneficial to detecting small targets; a larger cavity convolution ratio gives a larger receptive field, which is beneficial to detecting large targets. For example, a cavity convolution ratio of 1 means that the points of the convolution kernel are adjacent to each other, which corresponds to an ordinary convolution. With a cavity convolution ratio of 2, neighbouring points of the convolution kernel are separated by one pixel, so that a 3*3 convolution kernel with a cavity convolution ratio of 2 has the same receptive field as a 5*5 convolution kernel of an ordinary convolution.
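The relation between the cavity convolution ratio and the receptive field can be checked with the standard formula for a dilated kernel, k + (k - 1)(d - 1), where k is the kernel size and d the ratio. The short sketch below evaluates it for a 3*3 kernel; the function names are illustrative, and the ratios 6, 12 and 18 are those mentioned for fig. 2b.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span in pixels covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def tap_offsets(kernel_size: int, dilation: int) -> list:
    """Positions of the kernel points relative to the centre, along one axis."""
    centre = (kernel_size - 1) // 2
    return [(i - centre) * dilation for i in range(kernel_size)]


for rate in (1, 2, 6, 12, 18):
    print(rate, tap_offsets(3, rate), effective_kernel_size(3, rate))
```

For a ratio of 2 the points sit at offsets -2, 0 and 2, which spans five pixels, in agreement with the 5*5 equivalence stated above.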

To increase the receptive field of the multi-channel basic image features under each scale so that the receptive field covers the whole area of the multi-channel basic image features without blind spots, the cavity convolution ratio corresponding to the multi-channel basic image features of each scale can be acquired, and the convolution kernel of each cavity convolution ratio can be convolved with the matched multi-channel basic image features.

Optionally, performing feature fusion on the multi-channel basic image features under each scale to obtain the multi-channel fused image features may include: inputting the multi-channel basic image features under each scale together into a merging network to obtain the fused image features of the first channel number; and performing convolution processing on the fused image features of the first channel number by using 1*1 convolution kernels of a set channel number to obtain the fused image features of the second channel number.
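The effect of the two fusion steps on the channel count can be followed with simple shape bookkeeping. In the sketch below, the four branches, their 256 channels each and the 64 x 64 spatial size are hypothetical values chosen for illustration; only the final set channel number of 256 is taken from the embodiment described for fig. 2b.

```python
def concat_channels(shapes):
    """Shape after concatenating feature stacks along the channel axis.

    shapes: list of (channels, height, width); all spatial sizes must match.
    """
    if len({(h, w) for _, h, w in shapes}) != 1:
        raise ValueError("all branches must share the same height and width")
    _, height, width = shapes[0]
    return (sum(c for c, _, _ in shapes), height, width)


def conv1x1_shape(shape, out_channels):
    """Shape after a 1*1 convolution: only the channel count changes."""
    _, height, width = shape
    return (out_channels, height, width)


branches = [(256, 64, 64)] * 4            # hypothetical branch outputs
first = concat_channels(branches)         # "first channel number"
second = conv1x1_shape(first, 256)        # "second channel number"
print(first, second)
```

Concatenation keeps every input channel, and the 1*1 convolution then compresses the stack to the set channel number while leaving the spatial size unchanged.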

The merging network refers to a network layer used for feature fusion.

The fused image features of the first channel number refer to the image features obtained by fusing the multi-channel basic image features of the multiple scales.

The fused image features of the second channel number refer to image features of a specific dimension obtained by integrating the information in the fused image features of the first channel number without losing any of those features.

Image features at a low-dimensional scale have a higher feature resolution and contain more detail information, whereas image features at a high-dimensional scale have a lower feature resolution and a poorer ability to perceive detail. Therefore, in the embodiment of the invention, so that the extracted features better describe the primitives in the building plan, the multi-channel basic image features at different scales can be fused, which improves the edge detection effect and in turn the primitive detection performance for the building plan. Convolution processing can further be performed on the fused image features of the first channel number so that the image features of the low-dimensional scale or the high-dimensional scale are retained. For example, the fused image features of the first channel number are convolved with 1*1 convolution kernels of 256 channels to obtain the fused image features of the second channel number, so that the image features of the set scale are retained and loss of the fused image features of the first channel number is avoided.

S240, obtaining a plurality of segmentation graphs according to the fusion image characteristics of the multiple channels, wherein the primitives included in different segmentation graphs correspond to different kernel scales.

S250, combining the primitives with different kernel proportions in each divided graph by adopting a progressive expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region.
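A progressive expansion over kernels of increasing size can be sketched with a breadth-first search, which the device description names as the merging method (from the smallest kernel proportion to the largest). The sketch below follows the general pattern of progressive scale expansion as known from text-detection work; whether the patented algorithm matches it in detail is an assumption, as are the 4-connectivity, the first-come-first-served handling of contested pixels, and all function names.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # assumed 4-connectivity


def label_components(mask):
    """Label 4-connected regions of a binary mask with 1, 2, 3, ..."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not labels[sy][sx]:
                next_label += 1
                labels[sy][sx] = next_label
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_expansion(kernels):
    """Grow regions seeded by the smallest kernel through ever larger kernels.

    kernels: binary masks ordered from the smallest to the largest kernel.
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for mask in kernels[1:]:
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and mask[ny][nx] and not labels[ny][nx]):
                    labels[ny][nx] = labels[y][x]  # the first region to arrive keeps the pixel
                    queue.append((ny, nx))
    return labels


# Two hypothetical primitives that touch in the largest kernel.
kernels = [
    [[0, 1, 0, 0, 0, 1, 0]],   # smallest kernel: two separate seeds
    [[1, 1, 1, 1, 1, 1, 1]],   # largest kernel: one connected strip
]
regions = progressive_expansion(kernels)
print(regions)
```

Because each region grows outward one layer at a time from its own seed, two primitives that merge into one blob in the largest kernel are still assigned separate labels.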

As a specific embodiment, fig. 2b provides a schematic diagram of a model structure for obtaining the primitive recognition result of the building plan. First, the building plan to be identified is input into the residual network ResNet, in which the output results of the first residual block ResBlock1, the second residual block ResBlock2, the fourth residual block ResBlock4 and the sixth residual block ResBlock6 have 1/4, 1/8, 1/16 and 1/32 of the resolution of the input building plan, respectively, so that the output results of these four residual blocks can be obtained and used as the multi-channel basic image features under four scales. Second, the multi-channel basic image features under the four scales are convolved with 1*1 convolution kernels having channel numbers of 16, 32, 125 and 256, respectively, and each result is then up-sampled by a factor of 2, so that high-dimensional features are added to the multi-channel basic image features without losing any of them. Then, a 1*1 standard convolution and three 3*3 convolution kernels with cavity convolution ratios of 6, 12 and 18 are each convolved with the matched multi-channel basic image features, which increases the receptive field of the multi-channel basic image features under each scale without losing any of them and captures multi-scale information; the multi-channel basic image features under each scale are input together into the merging network Concat to obtain the fused image features of the first channel number, and these are convolved with 1*1 convolution kernels of the set channel number 256 to obtain the fused image features of the second channel number without losing any of the fused image features of the first channel number. Finally, a plurality of segmentation maps are obtained from the multi-channel fused image features, the progressive expansion algorithm is used to combine the primitives with different kernel proportions in the segmentation maps into at least one primitive region, and the primitive identification result corresponding to the primitive region is output.
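The spatial sizes of the four basic feature scales can be worked out for a concrete input. The sketch below assumes that "1/4 of the resolution" means one quarter of the height and of the width, that sizes are rounded down, and that the input measures 1024 x 768 pixels; none of this is stated in the text. The channel numbers are those given above for the 1*1 convolutions, and the sizes shown are those before the factor-of-2 up-sampling.

```python
def scale_table(height, width, strides=(4, 8, 16, 32), channels=(16, 32, 125, 256)):
    """List (scale, channels, height, width) for each basic feature scale.

    Floor division is an assumption; the exact size depends on the padding
    used inside the residual network.
    """
    return [(f"1/{s}", c, height // s, width // s)
            for s, c in zip(strides, channels)]


table = scale_table(1024, 768)   # hypothetical input size
for row in table:
    print(row)
```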

For details not explained in this embodiment, reference is made to the foregoing embodiments.

According to the technical scheme of this embodiment, the building plan is input into the residual network, and the output results of the residual blocks are used as the multi-channel basic image features under a plurality of scales. The multi-channel basic image features under each scale are up-sampled, their receptive fields are then increased by the cavity convolution algorithm, and feature fusion is performed to obtain the multi-channel fused image features. This effectively increases the high-dimensional features, captures more feature detail information, and improves both the feature resolution and the detection accuracy of the primitives. The segmentation maps obtained from the multi-channel fused image features are then combined by the progressive expansion algorithm to obtain at least one primitive region and the corresponding primitive identification result, so that the boundary pixels of adjacent or intersecting primitives can be effectively separated, the primitives in the building plan are detected accurately, and the problems of missed detection and false detection caused by occlusion or interference are avoided.

Example three

Fig. 3 is a schematic structural diagram of a primitive detection device in a building plan according to a third embodiment of the present invention, where the embodiment of the present invention is applicable to a case of accurately detecting primitives in a building plan, and the device may be implemented in a software and/or hardware manner and may be generally integrated in a computer device.

As shown in fig. 3, the primitive detection device in the building plan specifically includes: a multi-channel basic image feature extraction module 310, a multi-channel fusion image feature generation module 320, a segmentation map generation module 330 and a primitive identification result acquisition module 340, wherein:

a multi-channel base image feature extraction module 310, configured to extract multi-channel base image features under multiple scales from a building plan to be identified;

the multi-channel fusion image feature generation module 320 is configured to perform feature fusion on the multi-channel basic image features under each scale after the receptive field of the multi-channel basic image features under each scale is increased by using a cavity convolution algorithm, so as to obtain multi-channel fusion image features;

the segmentation map generating module 330 is configured to obtain a plurality of segmentation maps according to the fused image features of the multiple channels, where primitives included in different segmentation maps correspond to different kernel scales;

the primitive identification result acquisition module 340 is configured to combine the primitives with different kernel proportions in each segmentation map by using a progressive expansion algorithm to obtain at least one primitive region, and to acquire a primitive identification result corresponding to the primitive region.
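The way the four modules hand their results to one another can be sketched as a simple pipeline. The class name, the field names and the placeholder stages below are hypothetical; real stages would wrap the residual network, the fusion step, the segmentation model and the progressive expansion, respectively.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PrimitiveDetectionDevice:
    extract_features: Callable[[Any], Any]        # role of module 310
    fuse_features: Callable[[Any], Any]           # role of module 320
    make_segmentation_maps: Callable[[Any], Any]  # role of module 330
    merge_primitives: Callable[[Any], Any]        # role of module 340

    def detect(self, building_plan):
        """Run the four stages in the order given in the description."""
        base = self.extract_features(building_plan)
        fused = self.fuse_features(base)
        maps = self.make_segmentation_maps(fused)
        return self.merge_primitives(maps)


# Placeholder stages that only record the order in which they are called.
device = PrimitiveDetectionDevice(
    extract_features=lambda plan: [plan, "310"],
    fuse_features=lambda data: data + ["320"],
    make_segmentation_maps=lambda data: data + ["330"],
    merge_primitives=lambda data: data + ["340"],
)
trace = device.detect("plan.png")
print(trace)
```

Keeping each stage behind its own callable mirrors the statement that the device may be implemented in software and/or hardware: any stage can be replaced without touching the others.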

Optionally, the multi-channel basic image feature extraction module 310 is specifically configured to input the building plan into a residual network, and obtain output results of a plurality of residual blocks of the residual network as multi-channel basic image features under the plurality of scales; wherein each residual block is used for outputting multichannel basic image characteristics under a set scale.

Optionally, the apparatus further includes: and the up-sampling processing module is used for respectively carrying out up-sampling processing on the multi-channel basic image characteristics under each scale before the receptive field of the multi-channel basic image characteristics under each scale is increased by adopting a cavity convolution algorithm so as to increase the high-dimensional characteristics in the multi-channel basic image characteristics.

Optionally, the multi-channel fusion image feature generating module 320 is specifically configured to obtain a cavity convolution ratio corresponding to each scale of multi-channel base image feature; and respectively carrying out convolution operation on the convolution kernel of each cavity convolution ratio and the matched multichannel basic image features to increase the receptive field of the multichannel basic image features under each scale, and then carrying out feature fusion on the multichannel basic image features under each scale to obtain the multichannel fused image features.

Optionally, the multi-channel fusion image feature generation module 320 is specifically configured to, after the receptive field of the multi-channel basic image features under each scale is increased by using the cavity convolution algorithm, input the multi-channel basic image features under each scale together into the merging network to obtain the fused image features of the first channel number; and perform convolution processing on the fused image features of the first channel number by using 1*1 convolution kernels of the set channel number to obtain the fused image features of the second channel number.

Optionally, the segmentation map generating module 330 is specifically configured to perform channel fusion on the multi-channel fusion image features to obtain a target fusion image; inputting the target fusion image into an image segmentation model, and outputting a plurality of segmentation graphs through the image segmentation model, wherein each segmentation graph comprises a mask graph of all primitives with set kernel scales;

the primitive identification result obtaining module 340 is specifically configured to sequentially combine the primitives included in each divided map according to the order from small to large of the kernel proportion by adopting a breadth-first method, obtain at least one primitive region, and obtain a primitive identification result corresponding to the primitive region.

Optionally, the apparatus further includes: a module for detecting frames in a building drawing, configured to, before the multi-channel basic image features under multiple scales are extracted from the building plan to be identified, pre-identify a standard building drawing by using a morphological algorithm and crop at least one candidate frame detection area out of the standard building drawing according to the pre-identification result, wherein the standard building drawing comprises at least one small frame whose image size is smaller than or equal to a preset standard identification size;

and, for each candidate frame detection area, to perform the following frame detection processing operations: extracting multi-channel basic image features in the candidate frame detection area; extracting multi-channel high-dimensional image features from each basic image feature without losing any of the basic image features, and enhancing the feature quality of each high-dimensional image feature; performing feature fusion on the multi-channel high-dimensional image features, without losing any of them, to obtain multi-channel fused image features; and acquiring a frame identification result of the candidate frame detection area according to the multi-channel fused image features, and taking each identified frame as a building plan to be identified.

The graphic element detection device in the building plan can execute the graphic element detection method in the building plan provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the graphic element detection method in the building plan.

Example four

Fig. 4 is a schematic hardware structure of a computer device according to a fourth embodiment of the present invention. Fig. 4 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 4 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in FIG. 4, the computer device 12 is in the form of a general purpose computing device. Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard disk drive"). Although not shown in fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The system memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.

The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown in fig. 4, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, to implement a primitive detection method in a building plan provided by an embodiment of the present invention. That is, the processing unit realizes when executing the program:

Example five

A fifth embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a primitive detection method in a building plan as provided in all the inventive embodiments of the present application: that is, the program, when executed by the processor, implements:

Any combination of one or more computer readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).

Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims

1. A method for detecting primitives in a building plan, comprising:

obtaining a plurality of segmentation graphs according to the fusion image characteristics of the multiple channels, wherein the primitives included in different segmentation graphs correspond to different kernel scales; the graphic element refers to a member in a building plan and comprises various elements for constructing a building; the kernel scale refers to the segmentation scale of the multi-channel fusion image features corresponding to each segmentation map;

Combining the primitives with different kernel proportions in each divided graph by adopting a progressive expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region;

before extracting the multi-channel basic image features under multiple scales from the building plan to be identified, the method further comprises the following steps:

pre-identifying a standard building drawing by adopting a morphological algorithm, and intercepting at least one alternative frame detection area from the standard building drawing according to a pre-identification result, wherein the standard building drawing comprises at least one small frame with the image size smaller than or equal to a preset standard identification size;

for each candidate frame detection area, the following frame detection processing operations are performed:

extracting the basic image characteristics of multiple channels in an alternative frame detection area;

extracting multi-channel high-dimension image features from each basic image feature on the basis of keeping the basic image features not missing, and enhancing the image feature quality of each high-dimension image feature;

on the basis of keeping the high-dimensional image features not missing, carrying out feature fusion on the multi-channel high-dimensional image features to obtain multi-channel fusion image features;

And acquiring a frame identification result of the alternative frame detection area according to the multi-channel fusion image characteristics, and taking each identified frame as a building plan to be identified.

2. The method of claim 1, wherein extracting multi-channel basis image features at multiple scales in a building plan to be identified comprises:

inputting the building plan into a residual network, and obtaining output results of a plurality of residual blocks of the residual network as multichannel basic image features under a plurality of scales;

3. The method of claim 1, further comprising, prior to increasing the receptive field of the multi-channel base image features at each scale using a hole convolution algorithm:

4. The method of claim 1, wherein increasing the receptive field of the multi-channel base image features at each scale using a hole convolution algorithm comprises:

Acquiring cavity convolution ratios respectively corresponding to the multi-channel basic image features of each scale;

and respectively carrying out convolution operation on the convolution kernel of each cavity convolution ratio and the matched multichannel basic image features so as to increase the receptive field of the multichannel basic image features under each scale.

5. The method of claim 1, wherein feature fusion is performed on the multi-channel base image features at each scale to obtain multi-channel fused image features, comprising:

the multi-channel basic image features under each scale are input into a merging network together to obtain the merged image features of the first channel number;

and performing convolution processing on the fused image features of the first channel number by using 1*1 convolution kernels of the set channel number to obtain the fused image features of the second channel number.

6. The method of claim 1, wherein obtaining a plurality of segmentation maps from the multi-channel fused image features comprises:

channel fusion is carried out on the multi-channel fusion image characteristics to obtain a target fusion image;

inputting the target fusion image into an image segmentation model, and outputting a plurality of segmentation graphs through the image segmentation model, wherein each segmentation graph comprises a mask graph of all primitives with set kernel scales;

Combining the primitives with different kernel proportions in each divided graph by adopting a progressive expansion algorithm to obtain at least one primitive region, wherein the method comprises the following steps:

and sequentially combining the primitives included in each segmentation graph according to the sequence from small to large of the kernel proportion by adopting a breadth-first method to obtain at least one primitive region.

7. A primitive detection device in a building plan, comprising:

the segmentation map generation module is used for obtaining a plurality of segmentation maps according to the multi-channel fused image features, wherein the primitives included in different segmentation maps correspond to different kernel scales; a primitive refers to a component in a building plan and includes the various elements used to construct a building; the kernel scale refers to the segmentation scale of the multi-channel fused image features corresponding to each segmentation map;

the primitive identification result acquisition module is used for combining the primitives of different kernel scales in the segmentation maps using a progressive expansion algorithm to obtain at least one primitive region, and acquiring a primitive identification result corresponding to the primitive region;

the multi-channel basic image feature extraction module is further used for pre-identifying a standard building drawing using a morphological algorithm, and cropping at least one candidate frame detection area from the standard building drawing according to the pre-identification result, wherein the standard building drawing comprises at least one small frame whose image size is smaller than or equal to a preset standard identification size; for each candidate frame detection area, the following frame detection processing operations are performed: extracting multi-channel basic image features from the candidate frame detection area; extracting multi-channel high-dimensional image features from each basic image feature without losing the basic image features, and enhancing the image feature quality of each high-dimensional image feature; performing feature fusion on the multi-channel high-dimensional image features without losing the high-dimensional image features, to obtain multi-channel fused image features; and acquiring a frame identification result for the candidate frame detection area according to the multi-channel fused image features, and taking each identified frame as a building plan to be identified.

8. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1-6.

9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.