RU2785821C1 - Method for detecting objects in the image of the plan-scheme of the construction object

Info

Publication number: RU2785821C1
Application number: RU2022129480A
Authority: RU
Inventors: Ольга Дмитриевна Миронова; Елизавета Антоновна Крапивина; Павел Александрович Бахметов; Сергей Сергеевич Маркин; Никита Павлович Шиварев; Дмитрий Андреевич Аксенов
Original assignee: Ольга Дмитриевна Миронова
Filing date: 2022-11-14
Publication date: 2022-12-14

Abstract

FIELD: invention relates to the field of means for identifying objects on the plan-scheme of a construction site.

SUBSTANCE: the proposed method includes identifying symbolic images of construction elements on images of real estate objects using convolutional neural networks and classifying the identified objects as single-level or multi-level. Objects located on adjacent levels and classified as multi-level are then compared with one another, taking the level label into account, and objects on adjacent levels that have the same location of the symbolic image of the "stairs" construction element are combined into one object.

EFFECT: technical result is to provide the possibility of identifying multi-level objects on the plan-scheme of the construction object.

5 cl

Description

Technical field

This invention relates to means for identifying objects on the plan-scheme of a construction object by detecting the presence of conventional symbol elements. It can be used, in particular, in the analysis of various construction objects.

State of the art

Application CN 108764022 A, published on 6 November 2018, is known from the prior art. It discloses a method for recognizing images, in particular building plans, that uses a trained neural network to build an image classifier. Once an image is fed into the classifier, it is labeled with the corresponding class. All symbol classes can be grouped into four categories. The classifier outputs confidence scores for assigning a given symbol to a particular class. If the confidence score exceeds a preset confidence threshold, the recognized symbol is concluded to belong to the class assigned to it. This solution is aimed at improving the efficiency of processing images of symbols on building plans.

Also known from the prior art is application US 2019/0243928 A1, published on 8 August 2019, which discloses a method for semantic segmentation of 2D building plans using a pixel classifier. The method involves semantic segmentation of a floor plan image, with classification performed on individual pixels of the image by a trained neural network. Semantic segmentation makes it possible to identify several semantically meaningful parts in the image and to classify each part into one of a set of predetermined categories. Classification is performed over a set of classes that includes two classes of walls, doors, and windows. The classifiers contain a probability distribution over the set of classes, with each pixel assigned a probability value. This method improves the accuracy of processing two-dimensional building plans compared with earlier similar solutions.

The closest analogue of the claimed invention is a method for automated identification and use of building floor plan information, disclosed in application US 2022/0092227 A1, published on 24 March 2022. The method performs operations for identifying building floor plans that contain objects of interest to the user. In the course of the method, the relative position of objects such as rooms, walls, and windows of buildings is determined, and a corresponding graph is built. These objects are identified on the building floor plans using a trained neural network. After the floor plans have been analyzed and the objects present have been identified, the objects that meet given criteria are selected, for example apartments with a given number of rooms within certain size ranges. The graphs describing the relative position of objects make it possible to search by criteria such as the connections between rooms. This invention provides fast and efficient identification of objects on the building floor plan under study.

The main disadvantage of the analogues described above is that they cannot identify multi-level objects on the plan-scheme of a construction object.

The proposed invention is aimed at eliminating this disadvantage.

The technical result achieved by the invention is the ability to identify multi-level objects on the plan-scheme of a construction object.

Essence of the invention

The specified technical result is achieved, and the stated problem is solved, by a method for identifying real estate objects in an image of a multi-level plan-scheme of a construction object, the method including:

loading an image of a multi-level plan-scheme of a construction object;

assigning a level label to each level of the plan-scheme of the construction object;

identifying real estate objects on the level images using a convolutional neural network of the first type, and assigning a level label to each identified object;

using a convolutional neural network of the second type, recognizing symbolic images of construction elements on the images of the identified real estate objects and classifying the identified objects as multi-level or single-level, wherein an object is classified as multi-level when the symbolic image of the "stairs" construction element is detected;

comparing with one another, taking the level label into account, the objects located on adjacent levels and classified as multi-level;

combining into one object the objects located on adjacent levels that have the same location of the symbolic image of the "stairs" construction element;

identifying and counting identical symbolic images of construction elements belonging to each object, and determining their location relative to one another;

setting object filtering parameters;

based on the data obtained about the identified symbolic images of construction elements, their number, and their location relative to one another, identifying on the multi-level plan-scheme of the construction object the objects that match the filtering parameters;

wherein the neural network of the first type is trained on a set of data arrays that includes images of plan-schemes of real estate objects, and

the neural network of the second type is trained on a set of data arrays that includes symbolic images of construction elements and on a set of data arrays that includes images of plan-schemes of single-level and multi-level real estate objects.

In one embodiment, the image of the multi-level plan-scheme of the construction object may be in the RGB color format.

In one embodiment, the color image is converted to black and white in two stages:

converting the color image to a grayscale image;

converting the grayscale image to a black-and-white image.

In one embodiment, the image of a level of the multi-level plan-scheme of the construction object is an image of a floor of a building.

In one embodiment, the real estate object is an apartment or an office.

The ability to identify multi-level objects on the plan-scheme of the construction object is provided by the combination of the following operations: identifying real estate objects on the level images with a convolutional neural network of the first type and assigning them a level label; recognizing symbolic images of construction elements on the images of the identified real estate objects with a convolutional neural network of the second type and classifying the identified objects as multi-level or single-level; comparing with one another, taking the level label into account, the objects located on adjacent levels and classified as multi-level; and combining into one object the objects on adjacent levels that have the same location of the symbolic image of the "stairs" construction element.

Implementation of the invention

Embodiments of the invention eliminate the above and other disadvantages by providing mechanisms for identifying multi-level objects on the plan-scheme of a construction object.

The present disclosure describes a method for identifying real estate objects in an image of a multi-level plan-scheme of a construction object.

At the first stage, an image of the multi-level plan-scheme of the construction object is loaded. The image may be in the RGB color format. An RGB image may be converted from color to black and white. This conversion may include two stages:

converting the color image to a grayscale image;

converting the grayscale image to a black-and-white image.
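The two-stage conversion can be sketched as follows. This is a minimal illustration, not the patented implementation: the source does not specify the grayscale weights or the binarization rule, so the ITU-R BT.601 luma weights and the fixed threshold of 128 are assumptions, as are the function names.

```python
import numpy as np


def rgb_to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Stage 1: convert an H x W x 3 RGB image to an H x W grayscale image."""
    # Assumed weights (ITU-R BT.601 luma); the source does not name any.
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(np.float64) @ weights


def grayscale_to_binary(gray: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Stage 2: map each grayscale pixel to black (0) or white (255)."""
    # Assumed rule: a fixed global threshold; adaptive methods are also possible.
    return np.where(gray >= threshold, 255, 0).astype(np.uint8)


def rgb_to_black_and_white(rgb: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Run both stages in sequence."""
    return grayscale_to_binary(rgb_to_grayscale(rgb), threshold)
```

A fixed threshold is the simplest choice; scanned plans with uneven lighting might need a different binarization rule.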

Each level image of the multi-level plan-scheme of the construction object is assigned a level label.

A level image of the multi-level plan-scheme of the construction object may be an image of a floor of a building.

Next, real estate objects are identified on the level images using a convolutional neural network of the first type. Each identified object is assigned a level label indicating the level (floor) on which it is located.

Using a convolutional neural network of the second type, symbolic images of construction elements are recognized on the images of the identified real estate objects, and the identified objects are classified as multi-level or single-level.

An object is classified as multi-level when the symbolic image of the "stairs" construction element is detected on the image of the real estate object.

Multi-level objects located on adjacent levels are compared with one another, taking the level label into account. Objects located on adjacent levels that have the same location of the symbolic image of the "stairs" construction element are combined into one object.
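The comparison and merging step can be sketched as follows. This is a hypothetical illustration under stated assumptions: the `DetectedObject` structure, the coordinate tolerance `tol`, and the assumption that all level images share one coordinate system are not given in the source, which only requires "the same location" of the stairs symbol.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    object_id: str
    level: int      # level label assigned to the object
    stairs: list    # (x, y) centers of detected "stairs" symbols

    @property
    def is_multi_level(self) -> bool:
        # An object with at least one "stairs" symbol is multi-level.
        return bool(self.stairs)


def stairs_match(a: DetectedObject, b: DetectedObject, tol: float) -> bool:
    """True if some stairs symbol of a lies within tol of one of b (assumed rule)."""
    return any(abs(ax - bx) <= tol and abs(ay - by) <= tol
               for ax, ay in a.stairs for bx, by in b.stairs)


def merge_multi_level(objects: list, tol: float = 10.0) -> list:
    """Group object ids so that matched objects on adjacent levels share a group."""
    parent = {o.object_id: o.object_id for o in objects}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    multi = [o for o in objects if o.is_multi_level]
    for i, a in enumerate(multi):
        for b in multi[i + 1:]:
            if abs(a.level - b.level) == 1 and stairs_match(a, b, tol):
                parent[find(a.object_id)] = find(b.object_id)

    groups = {}
    for o in objects:
        groups.setdefault(find(o.object_id), []).append(o.object_id)
    return sorted(sorted(g) for g in groups.values())
```

Union-find is used here so that an object spanning three or more levels ends up in a single group even though only adjacent levels are compared directly.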

Next, identical symbolic images of construction elements belonging to each object are identified and counted, and their location relative to one another is determined.

After the object filtering parameters have been set, the objects that match them are identified on the multi-level plan-scheme of the construction object, based on the data obtained about the identified symbolic images of construction elements, their number, and their location relative to one another.

The symbolic images of construction elements may correspond, in particular, to doorways between rooms, exterior doorways, windows, stairs, and others.

Filtering parameters may include, in particular, the type of construction element (for example, a window or a door), the number of such elements, their location relative to one another, and others.
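Counting and filtering can be sketched as follows. The data layout (each object as a list of `(kind, x, y)` tuples) and the filter form (minimum counts per element type) are assumptions made for illustration; the source leaves the exact form of the filtering parameters open.

```python
from collections import Counter


def count_elements(elements: list) -> Counter:
    """Count identical construction-element symbols in one object."""
    return Counter(kind for kind, _, _ in elements)


def relative_offset(a: tuple, b: tuple) -> tuple:
    """Offset (dx, dy) of element b relative to element a."""
    return (b[1] - a[1], b[2] - a[2])


def matches_filter(elements: list, min_counts: dict) -> bool:
    """True if the object has at least the required number of each element type."""
    counts = count_elements(elements)
    return all(counts[kind] >= n for kind, n in min_counts.items())


def filter_objects(objects: dict, min_counts: dict) -> list:
    """Return the ids of objects that satisfy the filtering parameters."""
    return [oid for oid, elements in objects.items()
            if matches_filter(elements, min_counts)]
```

For example, `filter_objects(objects, {"window": 2, "door": 1})` would keep only objects with at least two windows and one door. A filter on relative location could be built on top of `relative_offset` in the same way.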

In some embodiments, the claimed method may be performed on one or more computing systems connected by a network. The network may be, for example, the Internet or a private network. In addition, the network may include various types of wired and/or wireless networks.

The computing systems may include various hardware components, such as processors and storage devices.

Information about the image of the plan-scheme of the construction object and/or additional related information may be obtained from one or more external sources.

The convolutional neural network used is a feedforward (no feedback connections) multilayer neural network in which convolutional layers alternate with subsampling layers (also called pooling layers).

Convolutional neural networks are used because they provide partial robustness to changes in scale, shifts, rotations, changes of viewing angle, and other distortions. Convolutional neural networks combine three architectural ideas to provide invariance to scale, rotation, shift, and spatial distortion:

- local receptive fields (provide local two-dimensional connectivity of neurons);

- shared synaptic weights (allow a feature to be detected anywhere in the image and reduce the total number of weights);

- hierarchical organization with spatial subsampling.

At present, the convolutional neural network and its modifications are considered the best algorithms for finding objects in a scene in terms of accuracy and speed.

In an ordinary perceptron, which is a fully connected neural network, each neuron is connected to all neurons of the previous layer, and each connection has its own weight. In a convolutional neural network, the convolution operation uses only a small weight matrix that is moved across the entire layer being processed (at the very beginning, directly across the input image), producing after each shift an activation signal for the neuron at the corresponding position in the next layer. In other words, different neurons of the output layer use the same weight matrix, which is also called the convolution kernel. The kernel is interpreted as a graphical encoding of some feature, for example the presence of an oblique line at a certain angle. The next layer, obtained by convolving with this weight matrix, then shows the presence of the feature in the processed layer and its coordinates, forming a so-called feature map. Naturally, a convolutional neural network has not one set of weights but a whole range of them, encoding image elements (for example, lines and arcs at different angles). These convolution kernels are not specified by the researcher in advance; they are formed by training the network with the classical error backpropagation method. A pass with each set of weights produces its own feature map, making the neural network multichannel (many independent feature maps in one layer). It should also be noted that when the weight matrix is moved across a layer, it is usually shifted not by a full step (the size of the matrix) but by a small distance. For example, a 5×5 weight matrix is shifted by one or two neurons (pixels) instead of five, so as not to step over the feature being sought.
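The sliding of a single kernel over a single-channel layer can be sketched as follows. This is a minimal NumPy illustration with assumed names, no padding, no bias, and no activation function; like most neural network libraries, it does not flip the kernel (strictly, this is cross-correlation).

```python
import numpy as np


def conv2d_single(layer: np.ndarray, kernel: np.ndarray, stride: int = 1) -> np.ndarray:
    """Slide one kernel over a 2D layer and return the resulting feature map."""
    kh, kw = kernel.shape
    out_h = (layer.shape[0] - kh) // stride + 1
    out_w = (layer.shape[1] - kw) // stride + 1
    fmap = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same kernel weights are reused at every position.
            patch = layer[i * stride:i * stride + kh, j * stride:j * stride + kw]
            fmap[i, j] = np.sum(patch * kernel)
    return fmap
```

With a 5×5 kernel on a 7×7 layer, a stride of 1 gives a 3×3 feature map and a stride of 2 gives a 2×2 map, while a stride of 5 would leave a single position and could miss features that straddle it.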

The subsampling operation (also called pooling) reduces the dimensions of the generated feature maps. In this network architecture, the fact that a feature is present is considered more important than exact knowledge of its coordinates, so the maximum of several neighboring neurons of the feature map is selected and taken as one neuron of a compacted feature map of smaller dimensions. Besides speeding up further computation, this operation makes the network more invariant to the scale of the input image.

A convolutional neural network consists of a large number of layers. After the initial layer (the input image), the signal passes through a series of convolutional layers in which convolution proper alternates with subsampling (pooling).

Alternating the layers makes it possible to build feature maps from feature maps: on each subsequent layer the map shrinks in size while the number of channels grows. In practice, this gives the network the ability to recognize complex hierarchies of features. Usually, after several layers, each feature map degenerates into a vector or even a scalar, but there are hundreds of such feature maps. At the output of the convolutional layers, several layers of a fully connected neural network (perceptron) are additionally placed, and the final feature maps are fed to its input.

The convolutional layer is the main building block of a convolutional neural network. For each channel, the convolutional layer has its own filter, whose convolution kernel processes the previous layer fragment by fragment (summing the results of the element-wise product for each fragment). The weights of the convolution kernel (a small matrix) are not known in advance and are set during training.

A distinctive feature of the convolutional layer is the comparatively small number of parameters set during training. For example, if the source image is 100×100 pixels with three channels (that is, 30,000 input neurons) and the convolutional layer uses filters with a 3×3 pixel kernel and 6 output channels, then only 9 kernel weights are determined during training for each combination of input and output channel, that is, 9×3×6=162 in total. The layer therefore requires only 162 parameters to be found, which is substantially fewer than the number of parameters required by a fully connected neural network.
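The arithmetic in this example can be checked with a short sketch. The function names are illustrative, bias terms are left out because the text counts only kernel weights, and the fully connected layer with 6 output neurons is an assumed point of comparison, not a figure from the source.

```python
def conv_weights(kernel_h: int, kernel_w: int, in_channels: int, out_channels: int) -> int:
    """Number of kernel weights in a convolutional layer (biases not counted)."""
    return kernel_h * kernel_w * in_channels * out_channels


def dense_weights(n_inputs: int, n_outputs: int) -> int:
    """Number of weights in a fully connected layer (biases not counted)."""
    return n_inputs * n_outputs


conv_params = conv_weights(3, 3, 3, 6)            # 9 * 3 * 6 = 162
dense_params = dense_weights(100 * 100 * 3, 6)    # 30,000 inputs, 6 outputs
```

Even a fully connected layer with only 6 output neurons on the same 30,000 inputs would need 180,000 weights, compared with 162 for the convolutional layer.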

The pooling layer (also called subsampling or downsampling) performs a non-linear compaction of the feature map: a group of pixels (usually 2×2) is reduced to a single pixel by a non-linear transformation. The maximum function is the most commonly used. The transformation operates on non-overlapping rectangles or squares, each of which is reduced to one pixel by selecting the pixel with the maximum value. The pooling operation substantially reduces the spatial size of the image. Pooling is interpreted as follows: if certain features were already identified by the preceding convolution, such a detailed image is no longer needed for further processing, and it is compacted to a less detailed one. In addition, filtering out details that are no longer needed helps to avoid overfitting. A pooling layer is usually inserted after a convolutional layer and before the next convolutional layer.
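A minimal sketch of 2×2 max pooling over non-overlapping blocks, as described above, is given below. The function name `max_pool_2x2` and the sample feature map are illustrative assumptions, and the sketch assumes that the height and width of the feature map are even.

```python
def max_pool_2x2(feature_map):
    """Reduce each non-overlapping 2x2 block to its maximum value."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(
                feature_map[2 * r][2 * c],
                feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c],
                feature_map[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 4x4 feature map; pooling halves each spatial dimension.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [2, 6, 3, 4],
]

pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 7]]
```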

After several passes of convolution and pooling, the system moves from a specific high-resolution pixel grid to more abstract feature maps. As a rule, each subsequent layer has more channels and a smaller image dimension in each channel. In the end, what remains is a large set of channels, each storing a small amount of data (possibly a single parameter), which are interpreted as the most abstract concepts extracted from the source image.

These data are combined and passed to a conventional fully connected neural network, which may itself consist of several layers. The fully connected layers no longer retain the spatial structure of the pixels and have a relatively small dimension compared with the number of pixels in the source image.

The training method used may be supervised learning (on labeled data): the error backpropagation method and its modifications.

The convolutional neural network of the first type is used to identify, on general plan-schemes of the levels of construction objects, the objects that are real estate objects. To solve this task, the convolutional neural network of the first type is trained on datasets that include images of plan-schemes of real estate objects, which serve as positive examples, and images of plan-schemes of common-use areas, which serve as negative examples. The ratio of positive to negative examples is 4 to 1: 8,000 positive and 2,000 negative.

After the convolutional neural network of the first type processes the input data, which contain a general plan-scheme of a level of the construction object together with a level label, the output is a set of plan-schemes of individual real estate objects, which serves as the input dataset for the convolutional neural network of the second type.

The convolutional neural network of the second type can be trained on a set of data arrays that includes conditional images of construction elements, such as windows, doors, and walls, and on a set of data arrays that includes images of plan-schemes of single-level and multi-level real estate objects.

The method steps described above make it possible to identify multi-level objects on the plan-scheme of a construction object.
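The combining of multi-level objects on adjacent levels that share the same location of the "stairs" symbol can be sketched as follows. This is only an illustration under stated assumptions: the `DetectedObject` structure, the coordinate representation of the stairs symbol, the `tolerance` parameter, and the grouping strategy are hypothetical and are not specified in the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DetectedObject:
    object_id: str                        # hypothetical identifier
    level: int                            # level label assigned to the plan
    stairs_xy: Optional[Tuple[int, int]]  # stairs symbol location, None if absent


def same_location(a, b, tolerance):
    """Assumed criterion: coordinates coincide within a pixel tolerance."""
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


def merge_multilevel(objects, tolerance=0):
    """Group objects that contain a stairs symbol (multi-level) when they lie on
    adjacent levels and their stairs symbols are at the same location."""
    multi = sorted(
        (o for o in objects if o.stairs_xy is not None), key=lambda o: o.level
    )
    groups = []
    for obj in multi:
        for group in groups:
            last = group[-1]
            if obj.level - last.level == 1 and same_location(
                obj.stairs_xy, last.stairs_xy, tolerance
            ):
                group.append(obj)
                break
        else:
            groups.append([obj])
    # Objects without a stairs symbol are single-level and stay on their own.
    singles = [[o] for o in objects if o.stairs_xy is None]
    return groups + singles


objects = [
    DetectedObject("A1", 1, (10, 20)),
    DetectedObject("B", 1, None),
    DetectedObject("A2", 2, (10, 20)),
    DetectedObject("C", 2, (50, 60)),
]

merged_ids = [[o.object_id for o in g] for g in merge_multilevel(objects)]
print(merged_ids)  # [['A1', 'A2'], ['C'], ['B']]
```

In this sketch, "A1" and "A2" are combined into one object because they lie on adjacent levels and their stairs symbols coincide, while "C" remains a separate multi-level candidate and "B" remains single-level.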

Claims

1. A method for detecting images of real estate objects on an image of a multi-level plan-scheme of a construction object, including:

loading an image of a multi-level plan-scheme of a construction object;

assigning a level label to each level of the plan-scheme of the construction object,

identifying, using a convolutional neural network of the first type, real estate objects on the images of the levels, with assignment of a level label to them,

using a convolutional neural network of the second type, recognizing conditional images of construction elements on the images of the identified real estate objects and classifying the identified objects as multi-level or single-level, wherein identification of the conditional image of the “stairs” construction element classifies the object as multi-level,

comparing with one another, taking into account the level label, the objects located at adjacent levels and classified as multi-level,

combining objects located at adjacent levels, having the same location of the conditional image of the “stairs” construction element, into one object,

identifying and counting identical conditional images of construction elements corresponding to each object, and determining their location relative to each other;

setting object filtering parameters;

based on the data obtained on the identified conditional images of construction elements, their number, as well as their location relative to each other, identifying objects corresponding to the filtering parameters on a multi-level plan-scheme of the construction object,

wherein the neural network of the first type is trained on a set of data arrays that includes images of plan-schemes of real estate objects,

the neural network of the second type is trained on a set of data arrays that includes conditional images of construction elements, and on a set of data arrays that includes images of plan-schemes of single-level and multi-level real estate objects.

2. The method according to claim 1, characterized in that the image of the multi-level plan-scheme of the construction object can be in RGB color format.

3. The method according to claim 2, characterized in that the color image is converted to black and white, which includes two stages:

converting a color image to a grayscale image;

converting a grayscale image to a black and white image.

4. The method according to claim 1, in which the image of the level of the multi-level plan-scheme of the construction object is an image of the floor of the house.

5. The method according to claim 1, in which the real estate object is an apartment or an office.