CN113591939A - Layer classification method and device

Info

Publication number: CN113591939A
Application number: CN202110782277.4A
Authority: CN (China)
Prior art keywords: layer, feature map, input image, classification, image
Priority / filing date: 2021-07-09
Publication date: 2021-11-02
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 崔淼 (Cui Miao), 陈成才 (Chen Chengcai)
Current Assignee: Shanghai Xiaoi Robot Technology Co Ltd
Original Assignee: Shanghai Xiaoi Robot Technology Co Ltd
Application filed by Shanghai Xiaoi Robot Technology Co Ltd; priority to CN202110782277.4A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The application provides a layer classification method and device. The method includes: acquiring a feature map of an input image, wherein the input image is a building image; determining a multi-scale feature map of each layer in the input image according to the feature map; performing feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map; and performing layer classification based on the target feature map to obtain a classification result of each layer. The method in the embodiments of the application can improve the accuracy of layer classification.

Description

Layer classification method and device
Technical Field
The application relates to the technical field of image recognition, and in particular to a layer classification method and device.
Background
The building design industry generally uses Computer Aided Design (CAD) software to draw architectural design drawings, and the drawings must then be reviewed to determine whether they contain violations of national standards. At present, architectural design drawings are reviewed mainly by experienced engineers, but the review task involves a heavy workload and is inefficient. It is therefore urgent to use computers in place of manual work for automated drawing review.
An architectural design drawing often includes multiple layers. For example, a single drawing may include various elements (also referred to as primitives) such as load-bearing walls, light walls, columns, doors, windows, furniture, fire protection, plumbing, and floors, which may be located in different layers of the drawing. To enable automated drawing review, each layer in the architectural design drawing needs to be accurately detected, so improving the accuracy of layer classification for architectural design drawings has become a technical problem that urgently needs to be solved.
Disclosure of Invention
In view of this, embodiments of the present application are directed to providing a method and an apparatus for layer classification, which can improve accuracy of layer classification.
In a first aspect, a layer classification method is provided, including: acquiring a feature map of an input image, wherein the input image is a building image; determining a multi-scale feature map of each layer in the input image according to the feature map; performing feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map; and performing layer classification based on the target feature map to obtain a classification result of each layer.
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image from the feature map enlarges the receptive field of the model (i.e., the layer classification model), and performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map; performing layer classification based on the target feature map therefore yields a more accurate layer classification result.
In some embodiments, before the obtaining of the feature map of the input image, the method further includes: determining the boundary position of each primitive in each layer by a morphological method; and determining a thumbnail of the input image based on the boundary position of each primitive; the acquiring of the feature map of the input image includes: determining the feature map based on the thumbnail.
In the embodiment of the application, the thumbnail of the input image is determined based on the boundary position of each primitive, and the feature map is determined based on the thumbnail. This avoids the loss of feature information that feature extraction on the high-resolution input image would cause, so the feature map contains more complete information and the accuracy of layer classification improves.
In some embodiments, the obtaining of the feature map of the input image includes: extracting the feature map of the input image by using a lightweight network.
In the embodiment of the application, extracting the feature map of the input image with a lightweight network increases the running speed of the model (i.e., the layer classification model) and thus improves the efficiency of layer classification.
In some embodiments, the lightweight network is composed of point convolution, depthwise separable convolution, and full pooling.
In some embodiments, the determining of the multi-scale feature map of each layer in the input image according to the feature map includes: pooling the feature map by using an atrous spatial pyramid pooling (ASPP) network to obtain the multi-scale feature map.
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image captures the global and local features of each layer simultaneously and enlarges the receptive field of the model (i.e., the layer classification model), which improves the accuracy of layer classification.
In some embodiments, the performing of feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map includes: performing feature enhancement on the multi-scale feature map based on the context information of each layer by using a deep learning model to obtain the target feature map, wherein the deep learning model is composed of an encoder, a decoder, and an attention mechanism.
In the embodiment of the application, performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map, which improves the accuracy of layer classification.
In a second aspect, an apparatus for layer classification is provided, including: an acquisition unit, configured to acquire a feature map of an input image, wherein the input image is a layer image; a determining unit, configured to determine a multi-scale feature map of each layer in the input image according to the feature map; a feature enhancement unit, configured to perform feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map; and a layer classification unit, configured to perform layer classification based on the target feature map to obtain a classification result of each layer.
In a third aspect, an apparatus for layer classification is provided, where the apparatus is configured to perform the method in the first aspect or any possible implementation manner of the first aspect.
In a fourth aspect, an apparatus for layer classification is provided, where the apparatus includes a storage medium and a processor, where the storage medium may be a non-volatile storage medium, where a computer-executable program is stored in the storage medium, and the processor is connected to the non-volatile storage medium and executes the computer-executable program to implement the first aspect or the method in any possible implementation manner of the first aspect.
In a fifth aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to perform the method of the first aspect or any possible implementation manner of the first aspect.
Optionally, as an implementation manner, the chip may further include a memory, where instructions are stored in the memory, and the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the first aspect or the method in any possible implementation manner of the first aspect.
In a sixth aspect, a computer-readable storage medium is provided, storing program code for execution by a device, the program code comprising instructions for performing the method in the first aspect or any possible implementation manner of the first aspect.
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image from the feature map enlarges the receptive field of the model (i.e., the layer classification model), and performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map; performing layer classification based on the target feature map therefore yields a more accurate layer classification result.
Drawings
Fig. 1 is a diagram of an application scenario applicable to the embodiment of the present application.
Fig. 2 is a schematic block diagram of a layer classification method in an embodiment of the present application.
Fig. 3 is a schematic block diagram of a layer classification method in another embodiment of the present application.
Fig. 4 is a schematic structural diagram of a layer classification model in an embodiment of the present application.
Fig. 5 is a schematic block diagram of an apparatus for layer classification in an embodiment of the present application.
Fig. 6 is a schematic block diagram of an apparatus for layer classification in another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method in the embodiments of the application may be applied to various scenarios in which an image to be processed needs to be handled; the scenario is not limited in the embodiments of the application. For example, the method may be applied to a scenario in which layer classification is performed on a building image.
Fig. 1 is a diagram of an application scenario applicable to the embodiment of the present application. The application scenario 100 in fig. 1 may include an image to be processed 110 and an image processing device 120.
It should be noted that the application scenario shown in fig. 1 is only an example and is not limited, and more or fewer devices or apparatuses may be included in the application scenario shown in fig. 1, which is not limited in the embodiment of the present application.
The image to be processed 110 may be an image obtained based on Computer Aided Design (CAD), where the image to be processed 110 may include a plurality of layers, and each layer may include one or more primitives (or may also be referred to as elements). For example, the image to be processed 110 may be an architectural image (or may also be referred to as an architectural design image) drawn using AutoCAD software. Alternatively, the image to be processed 110 may also be an architectural image drawn by using other methods or other CAD software, and the type, format, and the like of the architectural image are not limited in the embodiment of the present application.
The image processing apparatus 120 may be a computer device, a server (e.g., a cloud server), or other apparatuses or devices capable of performing image processing on an image to be processed (e.g., performing layer classification based on an architectural image).
For example, the image processing apparatus 120 may be a computer device, and the computer device may be a general-purpose computer or a computer device composed of an application-specific integrated circuit, and the like, which is not limited in this embodiment of the application.
Those skilled in the art will appreciate that the number of the above-described computer devices may be one or more than one, and that the types of multiple computer devices may be the same or different. In the embodiment of the present application, the number of computer devices and the type of device are not limited.
The computer device may be deployed with a neural network model and configured to perform image processing on an image to be processed, so as to obtain an image processing result for the image to be processed.
For example, the computer device may perform layer classification on the building image (i.e., the image to be processed) through a neural network model deployed therein (e.g., the neural network model may be a layer classification model), so as to obtain a classification result of a plurality of layers in the building image.
The computer device may itself be a server or a cloud server that directly performs image processing on the image to be processed.
Alternatively, the computer device may be connected to a server (not shown in fig. 1) via a communication network. The computer device may send an image to be processed thereof to the server, perform image processing on the image to be processed by using the neural network model in the server, and store an obtained image processing result (such as a classification result of a plurality of image layers in the image to be processed) as a sample image, so as to train the neural network model in the server, thereby obtaining the neural network model for performing the image processing.
The computer device may further obtain an image to be processed from the server, and further perform image processing on the image to be processed through the neural network model to obtain an image processing result of the image to be processed (for example, a classification result of a plurality of layers in the image to be processed).
To enable automated drawing review, each layer in the architectural design drawing must be accurately detected. However, an architectural design drawing usually includes many layers, and drawings in engineering practice usually come from multiple design firms that may not share a unified design standard: designers at different firms draw drawings in different formats according to their own habits. All of this makes layer classification of architectural design drawings difficult.
Based on the above problems, the embodiments of the application provide a layer classification method, which can improve the accuracy of layer classification.
Fig. 2 is a schematic block diagram of a method 200 for layer classification according to an embodiment of the present application. The method 200 may be performed by the image processing apparatus 120 in fig. 1. It should be understood that fig. 2 shows the steps or operations of the method 200, but these steps or operations are merely examples; embodiments of the present application may perform other operations or variations of the operations in fig. 2, not all of the steps may need to be performed, and the steps may be performed in other orders. The method 200 may include steps S210 to S240, as follows:
S210, acquiring a feature map of the input image.
Wherein the input image may be the image to be processed 110 in fig. 1. Optionally, the input image may be an architectural image, the input image may include one or more layers, and each layer may include one or more primitives (or may also be referred to as elements).
For example, the input image may include a plurality of different primitives such as load-bearing walls, light walls, pillars, doors, windows, furniture, fire-fighting pipes, plumbing, floors, refrigerators, dishwashers, sinks, or appliances, which may be respectively located in a plurality of different layers in the input image.
In some embodiments, a lightweight network may be used to extract the feature map of the input image, so that the running speed of a model (i.e., a layer classification model) may be increased, and thus the efficiency of layer classification may be improved.
Alternatively, the lightweight network may consist of point convolution, depthwise separable convolution, and full pooling. For example, the lightweight network may be a GhostNet network. Of course, other lightweight networks may also be used, which is not limited in the embodiment of the present application.
For example, the first two bottleneck layers of the GhostNet network may be selected as the backbone network, with the second bottleneck layer used as the base layer to extract the feature map of the input image (or of its thumbnail). A specific embodiment is described in the method 300 in fig. 3 and is not repeated here.
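As a rough illustration (not the patent's own code), the following PyTorch sketch shows a lightweight feature extractor of the kind described above, built from point (1 × 1) convolutions and depthwise separable convolutions; the channel counts, strides, and two-stage wiring are assumptions standing in for the first two GhostNet bottleneck layers:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A depthwise 3x3 convolution followed by a pointwise (1x1) convolution."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class LightweightBackbone(nn.Module):
    """Two bottleneck-style stages standing in for the first two GhostNet
    bottleneck layers; the second stage plays the role of the base layer."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 16, 3, stride=2, padding=1)
        self.stage1 = DepthwiseSeparableConv(16, 24, stride=2)
        self.stage2 = DepthwiseSeparableConv(24, 40, stride=2)  # base layer

    def forward(self, x):
        return self.stage2(self.stage1(self.stem(x)))

feat = LightweightBackbone()(torch.randn(1, 3, 640, 640))
print(feat.shape)  # torch.Size([1, 40, 80, 80])
```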
In some embodiments, before S210, the boundary position of each primitive in each layer may be determined by a morphological method, and a thumbnail of the input image may be further determined based on the boundary position of each primitive.
The morphological method may include erosion, dilation, binarization, edge detection, and similar processing; for the specific processing methods, reference may be made to the prior art, which is not described in this embodiment.
Alternatively, the resolution of the thumbnail may be smaller than that of the input image. For example, the input image may be a high-resolution image with a resolution of 40000 × 55000, and the resulting thumbnail may be a low-resolution image with a resolution of 640 × 640.
For example, before S210, a morphological method may be used to perform boundary detection on each primitive in the input image to obtain the boundary position of each primitive; the input image is then cropped based on the boundary position of each primitive to obtain an image for each primitive (each such image has a much lower resolution than the input image); next, the images of the primitives may be stitched onto a preset image to obtain the thumbnail of the input image, where the resolution of the preset image may be smaller than that of the input image.
The preset image may be a blank image, for example, the values of the elements in the preset image may be 0 or 255. Of course, the preset image may also be an image with a resolution smaller than that of the input image, which is not limited in the embodiment of the present application.
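A minimal sketch of this preprocessing, assuming OpenCV is available and using Otsu binarization, dilation, and contour bounding boxes as the morphological boundary detection (the 640 × 640 canvas, the kernel size, and the row-packing layout are illustrative assumptions, not the patent's exact procedure):

```python
import cv2
import numpy as np

def make_thumbnail(input_img, canvas_size=640):
    """Crop primitives found by morphological boundary detection and
    paste them onto a blank preset image (illustrative layout)."""
    gray = cv2.cvtColor(input_img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
    # Dilation merges nearby strokes so each primitive forms one blob.
    dilated = cv2.dilate(binary, np.ones((5, 5), np.uint8), iterations=2)
    contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    canvas = np.full((canvas_size, canvas_size, 3), 255, np.uint8)  # blank preset image
    x_off = y_off = row_h = 0
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)             # boundary position of a primitive
        crop = input_img[y:y + h, x:x + w]
        scale = min(1.0, canvas_size / 4 / max(w, h))  # shrink oversized primitives
        crop = cv2.resize(crop, (max(1, int(w * scale)), max(1, int(h * scale))))
        ch, cw = crop.shape[:2]
        if x_off + cw > canvas_size:                   # start a new row on the canvas
            x_off, y_off = 0, y_off + row_h
            row_h = 0
        if y_off + ch > canvas_size:
            break                                      # canvas full; remaining crops dropped
        canvas[y_off:y_off + ch, x_off:x_off + cw] = crop
        x_off += cw
        row_h = max(row_h, ch)
    return canvas
```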
Further, in some embodiments, the feature map may be determined based on the thumbnail. For example, the thumbnail may be subjected to feature extraction using a lightweight network to obtain the feature map.
In this way, the loss of feature information caused by extracting features from the high-resolution input image can be avoided, so the feature map contains more complete information and the accuracy of layer classification can be improved.
S220, determining the multi-scale feature map of each layer in the input image according to the feature map.
In some embodiments, feature extraction may be performed on the feature map using a Feature Pyramid Network (FPN), so as to obtain a multi-scale feature map of each layer in the input image.
For example, the feature map may be pooled using an atrous spatial pyramid pooling (ASPP) network to obtain the multi-scale feature map, where the ASPP network may include 4 layers (4 convolutional layers) whose dilation rates may be set to 8, 12, 16, and 32, respectively. A specific embodiment is described in the method 300 in fig. 3 and is not repeated here.
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image captures the global and local features of each layer simultaneously and enlarges the receptive field of the model (i.e., the layer classification model), which improves the accuracy of layer classification.
S230, performing feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map.
In some embodiments, a deep learning model may be used to perform feature enhancement on the multi-scale feature map based on the context information of each layer, so as to obtain the target feature map.
The deep learning model may be composed of an encoder, a decoder, and an attention mechanism. For example, the deep learning model may be a Transformer model. Of course, the deep learning model may also be another model capable of extracting the context information of each layer in the input image, which is not limited in the embodiment of the application.
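A minimal sketch of such Transformer-based enhancement, flattening the feature map into a token sequence so self-attention can mix context across all positions; the channel width, depth, and head count are assumptions, not figures from the patent:

```python
import torch
import torch.nn as nn

class TransformerEnhancer(nn.Module):
    """Flatten an (N, C, H, W) feature map into H*W tokens, enhance them
    with self-attention, and reshape back; depth and heads are illustrative."""
    def __init__(self, channels=256, num_heads=8, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, x):
        n, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (N, H*W, C)
        tokens = self.encoder(tokens)           # context shared across positions
        return tokens.transpose(1, 2).reshape(n, c, h, w)

enhanced = TransformerEnhancer()(torch.randn(1, 256, 40, 40))
```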
In the embodiment of the application, performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map, which improves the accuracy of layer classification.
S240, performing layer classification based on the target feature map to obtain a classification result of each layer.
In some embodiments, a classifier may be used to perform layer classification based on the target feature map, so as to obtain a classification result of each layer.
The classifier may be a Support Vector Machine (SVM) or a random forest; the type of classifier is not limited in this embodiment.
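For illustration only, a hedged sketch of this classification step with scikit-learn; the pooled 256-dimensional per-layer feature vectors, the class count, and the random training data are placeholders, not the patent's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Hypothetical per-layer feature vectors: one pooled row of the target
# feature map per layer, with a ground-truth class label for training.
X_train = np.random.rand(200, 256)       # 200 layers x 256-dim pooled features
y_train = np.random.randint(0, 9, 200)   # e.g. wall / door / window / ... classes

clf = RandomForestClassifier(n_estimators=100)  # or: clf = SVC(kernel="rbf")
clf.fit(X_train, y_train)
pred = clf.predict(np.random.rand(5, 256))      # classify 5 new layers
```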
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image from the feature map enlarges the receptive field of the model (i.e., the layer classification model), and performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map; performing layer classification based on the target feature map therefore yields a more accurate layer classification result.
The following describes the layer classification method in the embodiment of the present application, with reference to fig. 3, by taking a specific layer classification model as an example.
Fig. 3 is a schematic block diagram of a method 300 for layer classification according to an embodiment of the present application. The method 300 may be performed by the image processing apparatus 120 in fig. 1. It should be understood that fig. 3 shows the steps or operations of the method 300, but these steps or operations are merely examples; embodiments of the present application may perform other operations or variations of the operations in fig. 3, not all of the steps may need to be performed, and the steps may be performed in other orders. The method 300 may include steps S310 to S350, as follows:
S310, preprocessing the input image to obtain a thumbnail of the input image.
The input image may be a building image, the input image may include one or more layers, and each layer may include one or more primitives. Alternatively, the resolution of the thumbnail may be smaller than the input image.
Optionally, the input image may be preprocessed by a morphological method to obtain the thumbnail. The morphological method may include erosion, dilation, binarization, edge detection, and similar processing. For the specific preprocessing of the input image, reference may be made to the embodiment of the method 200 in fig. 2, which is not repeated here.
It should be noted that step S310 may be executed by the image processing apparatus 120 in fig. 1; alternatively, step S310 may be executed by a unit or module in the layer classification model, that is, the layer classification model in the embodiment of the present application may also include a unit or module for executing step S310 (not shown in the layer classification model of fig. 4).
S320, performing feature extraction on the thumbnail to obtain a feature map.
The feature map may include feature information of each layer in the input image.
Optionally, a lightweight network may be used to perform feature extraction on the thumbnail to obtain the feature map, so that the running speed of a model (i.e., a layer classification model) may be increased, and thus the efficiency of layer classification may be increased.
Optionally, the lightweight network may be a GhostNet network.
In the embodiment of the application, extracting the feature map of the input image with a lightweight network increases the running speed of the model (i.e., the layer classification model) and thus improves the efficiency of layer classification.
For example, as shown in fig. 4, a specific layer classification model in the embodiment of the application may select the first two bottleneck layers of a GhostNet network as the backbone network of the layer classification model and extract the feature map of the input image with the second bottleneck layer as the base layer.
S330, pooling the feature map with an atrous spatial pyramid pooling (ASPP) network to obtain a multi-scale feature map.
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image captures the global and local features of each layer simultaneously and enlarges the receptive field of the model (i.e., the layer classification model), which improves the accuracy of layer classification.
For example, as shown in fig. 4, the ASPP network may include 4 layers (4 convolutional layers); the numbers of output channels of the 4 layers may be 256, 512, 1024, and 2048, respectively; the convolution kernels of the 4 layers may be 3 × 3; and the dilation rates of the 4 layers may be set to 8, 12, 16, and 32, respectively.
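A sketch of an ASPP module matching the figures quoted above (3 × 3 kernels, dilation rates 8/12/16/32, output channels 256/512/1024/2048); the input channel count of 40 is an assumption carried over from the backbone sketch earlier, not a value from the patent:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Four parallel 3x3 atrous convolutions with the dilation rates and
    channel counts given above; padding equals dilation, so spatial size
    is preserved and the outputs form the multi-scale feature maps."""
    def __init__(self, in_ch=40):
        super().__init__()
        specs = [(256, 8), (512, 12), (1024, 16), (2048, 32)]
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=rate, dilation=rate, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for out_ch, rate in specs
        )

    def forward(self, x):
        return [branch(x) for branch in self.branches]

scales = ASPP()(torch.randn(1, 40, 80, 80))  # 4 maps: 256/512/1024/2048 channels
```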
S340, performing feature enhancement on the multi-scale feature map based on the context information of each layer by using a Transformer model to obtain a target feature map.
In the embodiment of the application, performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map, which improves the accuracy of layer classification.
For example, as shown in fig. 4, the multi-scale feature maps (obtained in S330) may first be concatenated (concat), then passed through a 1 × 1 convolution with 256 output channels, and the concatenated and convolved result may then be fed to a Transformer model for feature enhancement to obtain the target feature map.
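Continuing the sketches above, the fusion step might look as follows; the channel arithmetic (256 + 512 + 1024 + 2048 = 3840) follows the ASPP sketch, and feeding the fused map to the TransformerEnhancer from the earlier sketch is an assumption about how the pieces connect:

```python
import torch
import torch.nn as nn

# Concatenate the four multi-scale maps along the channel axis, then reduce
# to 256 channels with a 1x1 convolution before Transformer enhancement.
fuse = nn.Conv2d(256 + 512 + 1024 + 2048, 256, kernel_size=1)

def fuse_scales(scale_maps):
    x = torch.cat(scale_maps, dim=1)   # (N, 3840, H, W)
    return fuse(x)                     # (N, 256, H, W) -> TransformerEnhancer input

fused = fuse_scales([torch.randn(1, c, 80, 80) for c in (256, 512, 1024, 2048)])
```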
S350, performing layer classification based on the target feature map by using a classifier to obtain the classification result of each layer.
For example, as shown in fig. 4, the target feature map may first be concatenated (concat), then processed with a softmax activation function, and the classifier may then perform layer classification to obtain the final classification result (i.e., the classification result of each layer in the input image).
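Chaining the earlier sketches end to end gives a rough picture of the whole pipeline; global average pooling of the target feature map into a single 256-dimensional vector is a simplification standing in for the per-layer pooling and softmax stage the patent describes:

```python
import torch

# Chain the earlier illustrative modules end to end. A 256x256 dummy image
# keeps the attention matrix small; a real 640x640 thumbnail works the same way.
backbone, aspp, enhancer = LightweightBackbone(), ASPP(), TransformerEnhancer()

thumb = torch.randn(1, 3, 256, 256)           # stand-in for the S310 thumbnail
feat = backbone(thumb)                        # S320: feature map, (1, 40, 32, 32)
target = enhancer(fuse_scales(aspp(feat)))    # S330 + S340: target feature map
vec = target.mean(dim=(2, 3))                 # global pooling to one 256-dim vector
pred = clf.predict(vec.detach().numpy())      # S350: the classifier fitted earlier
```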
In the embodiment of the application, determining the multi-scale feature map of each layer in the input image from the feature map enlarges the receptive field of the model (i.e., the layer classification model), and performing feature enhancement on the multi-scale feature map integrates the context information of each layer into the resulting target feature map; performing layer classification based on the target feature map therefore yields a more accurate layer classification result.
Fig. 5 is a schematic block diagram of an apparatus 500 for layer classification according to an embodiment of the present application. It should be understood that the apparatus 500 shown in fig. 5 is only an example, and the apparatus 500 of the embodiments of the present application may further include other modules or units. It should be understood that the apparatus 500 is capable of performing the various steps in the method of fig. 2 and, to avoid repetition, will not be described in detail herein.
An obtaining unit 510, configured to obtain a feature map of an input image, where the input image is a layer image;
a first determining unit 520, configured to determine a multi-scale feature map of each layer in the input image according to the feature map;
a feature enhancing unit 530, configured to perform feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map;
and the layer classification unit 540 is configured to perform layer classification based on the target feature map to obtain a classification result of each layer.
Optionally, the apparatus 500 further includes a second determining unit 550, configured to determine, by a morphological method, a boundary position of each primitive in each layer; determining a thumbnail of the input image based on the boundary position of each primitive; the obtaining unit 510 is specifically configured to determine the feature map based on the thumbnail.
Optionally, the obtaining unit 510 is specifically configured to extract a feature map of the input image by using a lightweight network.
Optionally, the lightweight network consists of point convolution, depthwise separable convolution, and full pooling.
Optionally, the first determining unit 520 is specifically configured to pool the feature map by using an atrous spatial pyramid pooling network to obtain the multi-scale feature map.
Optionally, the feature enhancing unit 530 is specifically configured to perform feature enhancement on the multi-scale feature map based on the context information of each layer by using a deep learning model to obtain the target feature map, where the deep learning model is composed of an encoder, a decoder, and an attention mechanism.
It should be appreciated that the apparatus 500 herein is embodied in the form of functional modules. The term "module" herein may be implemented in software and/or hardware, and is not particularly limited thereto. For example, a "module" may be a software program, a hardware circuit, or a combination of both that implements the functionality described above. The hardware circuitry may include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group of processors) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
As an example, the apparatus 500 provided in the embodiment of the present application may be a processor or a chip, and is configured to perform the method described in the embodiment of the present application.
Fig. 6 is a schematic block diagram of an apparatus 400 for layer classification according to an embodiment of the present application. The apparatus 400 shown in fig. 6 includes a memory 401, a processor 402, a communication interface 403, and a bus 404. The memory 401, the processor 402 and the communication interface 403 are connected to each other by a bus 404.
The memory 401 may be a Read Only Memory (ROM), a static memory device, a dynamic memory device, or a Random Access Memory (RAM). The memory 401 may store a program, and when the program stored in the memory 401 is executed by the processor 402, the processor 402 is configured to perform the steps of the method according to the embodiment of the present application, for example, the steps of the embodiments shown in fig. 2 and 3 may be performed.
The processor 402 may be a general-purpose Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the methods of the embodiments of the present application.
The processor 402 may also be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the method of the embodiment of the present application may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 402.
The processor 402 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 401, and the processor 402 reads the information in the memory 401 and, in combination with its hardware, performs the functions of the units included in the apparatus for layer classification in the embodiments of the present application, or performs the method in the method embodiments of the present application, for example, the steps/functions in the embodiments shown in fig. 2 and fig. 3.
The communication interface 403 may use transceiver means, such as, but not limited to, a transceiver, to enable communication between the apparatus 400 and other devices or communication networks.
Bus 404 may include a path that transfers information between various components of apparatus 400 (e.g., memory 401, processor 402, communication interface 403).
It should be understood that the apparatus 400 shown in the embodiments of the present application may be a processor or a chip for performing the methods described in the embodiments of the present application.
It should be understood that in the embodiments of the present application, the processor may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It should be understood that in the embodiment of the present application, "B corresponding to a" means that B is associated with a, from which B can be determined. It should also be understood that determining B from a does not mean determining B from a alone, but may be determined from a and/or other information.
It should be understood that the term "and/or" herein is merely one type of association relationship that describes an associated object, meaning that three relationships may exist, e.g., a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced, in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another via a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) connection. The computer-readable storage medium may be any available medium that can be read by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Versatile Disc (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A method for layer classification, comprising:
acquiring a feature map of an input image, wherein the input image is a building image;
determining a multi-scale feature map of each layer in the input image according to the feature map;
performing feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map;
and performing layer classification based on the target feature map to obtain a classification result of each layer.
2. The method of claim 1, wherein prior to said obtaining the feature map of the input image, the method further comprises:
determining the boundary position of each primitive in each layer by a morphological method;
determining a thumbnail of the input image based on the boundary position of each primitive;
the acquiring of the feature map of the input image includes:
determining the feature map based on the thumbnail.
3. The method of claim 2, wherein the obtaining the feature map of the input image comprises:
and extracting a feature map of the input image by using a lightweight network.
4. The method of claim 3, wherein the lightweight network consists of point convolution, depthwise separable convolution, and full pooling.
5. The method according to claim 1, wherein the determining the multi-scale feature map of each layer in the input image according to the feature map comprises:
and pooling the feature map by using an atrous spatial pyramid pooling network to obtain the multi-scale feature map.
6. The method according to claim 1, wherein the performing feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map comprises:
performing feature enhancement on the multi-scale feature map based on the context information of each layer by using a deep learning model to obtain the target feature map, wherein the deep learning model is composed of an encoder, a decoder, and an attention mechanism.
7. An apparatus for layer classification, comprising:
an acquisition unit, configured to acquire a feature map of an input image, wherein the input image is a layer image;
a determining unit, configured to determine a multi-scale feature map of each layer in the input image according to the feature map;
a feature enhancement unit, configured to perform feature enhancement on the multi-scale feature map based on the context information of each layer to obtain a target feature map;
and a layer classification unit, configured to perform layer classification based on the target feature map to obtain a classification result of each layer.
8. An apparatus for layer classification, comprising a processor and a memory, the memory configured to store program instructions, the processor configured to invoke the program instructions to perform the method of any of claims 1 to 6.
9. A computer readable storage medium comprising computer instructions which, when executed on a computer, cause the computer to perform the method of any of claims 1 to 6.
CN202110782277.4A (filed 2021-07-09, priority 2021-07-09): Layer classification method and device. Status: Pending. Publication: CN113591939A (en).

Priority Applications (1)

CN202110782277.4A, priority date / filing date 2021-07-09: Layer classification method and device

Publications (1)

CN113591939A, published 2021-11-02

Family

ID=78246953

Family Applications (1)

CN202110782277.4A, priority date / filing date 2021-07-09: Layer classification method and device (pending)

Country Status (1)

CN: CN113591939A (en)

Cited By (2)

* Cited by examiner, † Cited by third party

CN115147703A *, priority 2022-07-28, published 2022-10-04, 广东小白龙环保科技有限公司: GinTrans network-based garbage segmentation method and system
CN115147703B *, priority 2022-07-28, granted 2023-11-03, 广东小白龙环保科技有限公司: Garbage segmentation method and system based on GinTrans network


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination