CN110956196A - Automatic recognition method for window-wall ratio of urban building - Google Patents
- Publication number: CN110956196A (application CN201910964461.3A)
- Authority
- CN
- China
- Prior art keywords
- window
- picture
- prediction
- wall
- building
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24—Classification techniques
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
- G06V10/267—Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V20/00—Scenes; Scene-specific elements
- G06T2207/20081—Training; Learning
Abstract
The invention discloses an automatic identification method for the window-wall ratio of urban buildings. Sample pictures meeting certain requirements are photographed and labeled at the pixel level; an XML file is generated and imported into an improved Unet architecture. Recognition training is carried out separately for the exterior wall and the windows in each picture, using scaling and window scanning respectively. After training, a number of samples are selected for prediction and picture repair; when more than 80% of the prediction samples have a recognition error below 10%, the model is considered well trained. A model prediction library is then established from the chosen parameters and applied to window-wall ratio identification for the mass of buildings in a city. Large-scale picture acquisition across a city can rely on photography or on free map websites. Finally, the pictures to be predicted are fed into the model prediction library, which outputs the window-wall ratio of each building. This automatic identification method can serve urban energy consumption simulation and effectively improves both the speed and the accuracy of building an urban energy model.
Description
Technical Field
The invention belongs to the technical field of building energy conservation, relates to model construction for urban energy consumption simulation, and in particular relates to an automatic identification method for the window-wall ratio of urban buildings.
Background
In the new era, meeting the goal of energy conservation and emission reduction requires analyzing current energy consumption and producing clear, accurate predictions of a city's future energy demand, so that reasonable and effective energy policies can be made in advance. This urgent need has motivated the method of urban energy consumption simulation.
Urban energy consumption simulation is a bottom-up simulation method based on physical models. It is carried out by establishing a three-dimensional model of the city and assigning it information including thermal parameters of the building envelope, equipment system parameters, occupancy schedules, equipment operation schedules, and weather data. The German scholar Nouvel (2017) points out that the accuracy of urban energy consumption simulation is closely related to the accuracy of the input parameters. Among these, the urban three-dimensional model is both the foundation and the key of the simulation: establishing it accurately can greatly improve simulation accuracy. The American scholar Cerezo Davila (2016) noted that window-wall ratio information often deviates significantly from actual values when three-dimensional city models are built. The window-wall ratio is the ratio of the total area of exterior windows (including transparent curtain walls) in a given orientation to the total wall area (including the window area) in the same orientation. This parameter largely determines a building's daylighting and thermal insulation performance and has a significant impact on its energy consumption. In practice, since the index is not easily obtained directly, researchers usually assign a random value between 0.1 and 0.8 as a building's window-wall ratio to simplify modeling. This simplification inevitably makes the input parameters imprecise, which in turn affects the accuracy of the entire urban energy consumption simulation.
In recent years there has been little research on acquiring building window-wall ratio information. A relatively advanced method is to acquire it through oblique photography by unmanned aerial vehicles. However, this technique is expensive to implement, and flying drones in cities is subject to many regulations, making large-scale capture difficult; it is therefore even harder to apply to the massive number of buildings at urban scale. Meanwhile, with the continuous development of deep learning, image recognition technology has matured and is now widely used for autonomous driving, indoor navigation, street-view analysis, and so on. The Spanish scholar Garcia-Garcia (2017) notes that a computer can learn from a large number of training samples through algorithms such as convolutional neural networks to achieve semantic segmentation between different objects and thus complete image recognition. Although collecting and organizing training samples takes time and labor, establishing an effective training database is worthwhile, since it only has to be done once.
Disclosure of Invention
The invention aims to solve the following problem: existing methods for acquiring the building window-wall ratio are either inaccurate or too expensive to implement and cannot be extended to the modeling of massive numbers of urban buildings. A cheaper, fast, accurate, and general method for acquiring urban building window-wall ratio information is needed to provide technical support for urban energy consumption simulation research.
To solve this technical problem, the invention discloses an automatic identification method for the window-wall ratio of urban buildings. To acquire window-wall ratio information for massive numbers of buildings in a city, a training library containing a certain number of building facades is established to guide a computer in automatically identifying the window-wall ratio of all remaining buildings. In fact, the number of training samples is far smaller than the number of buildings to be identified. This approach greatly improves working efficiency and, compared with the conventional random-value method, improves the precision of urban three-dimensional modeling and thus the accuracy of urban energy consumption simulation.
In order to achieve the purpose, the invention discloses an automatic identification method of a window-wall ratio of an urban building, which comprises the following steps:
the method comprises the following steps: shooting a sample building facade to obtain a sample picture;
the sample building facade shot is the initial step in building the training library. For the selection of samples, the following requirements apply:
1. the sample building should cover as many building types as possible, such as: residences, shopping malls, offices, schools, hospitals, etc.;
2. the sample building should cover as different architectural styles as possible, such as: large-area glass curtain wall office buildings and non-large-area glass curtain wall office buildings; standard rectangular facade buildings, non-rectangular facade buildings and the like;
3. the sample buildings should, as far as possible, represent buildings of different construction eras, such as: new and old residential districts, etc.
When shooting the building facade, the whole facade need not be captured completely, but the following should be ensured as far as possible:
1. shoot the target facade from a front-on view, avoiding upward or oblique angles;
2. the facade should be essentially free of occlusion, such as: trees, cars, etc.;
3. shoot the facade at a short distance so that the windows do not appear too small.
The shooting tool can select mobile equipment such as a mobile phone, a camera and the like.
Step two: labeling a sample picture;
The invention adopts pixel-level labeling, i.e., labeling strictly along the outer contour of each object: the outlines of the building's exterior wall and of every window in the photograph are labeled one by one, and each labeled object is given a label, "wall" or "window". After labeling is finished, an Extensible Markup Language (XML) file is exported.
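The patent does not fix the export format beyond "an XML file", so the following Python sketch assumes a minimal hypothetical annotation schema (one `<object>` element per labeled region, holding a `<label>` and polygon `<pt>` points) purely to illustrate how the exported labels could be read back:

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation layout -- the element and attribute names below
# are assumptions for illustration, not the patent's actual schema.
SAMPLE_XML = """
<annotation>
  <object>
    <label>wall</label>
    <polygon><pt x="0" y="0"/><pt x="100" y="0"/><pt x="100" y="80"/><pt x="0" y="80"/></polygon>
  </object>
  <object>
    <label>window</label>
    <polygon><pt x="10" y="10"/><pt x="30" y="10"/><pt x="30" y="25"/><pt x="10" y="25"/></polygon>
  </object>
</annotation>
"""

def load_annotations(xml_text):
    """Return a list of (label, [(x, y), ...]) pairs from an annotation file."""
    root = ET.fromstring(xml_text)
    regions = []
    for obj in root.iter("object"):
        label = obj.findtext("label")
        points = [(float(pt.get("x")), float(pt.get("y")))
                  for pt in obj.find("polygon").iter("pt")]
        regions.append((label, points))
    return regions

regions = load_annotations(SAMPLE_XML)
```

Each polygon can then be rasterized into a "wall" or "window" mask for training.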
Step three: training a model;
1. Selecting the training model architecture: the basic architecture adopted by the invention is the Unet architecture. Unet, originally designed for medical image segmentation, has these advantages: it needs few input feature parameters, suits prediction from small sample sizes, and handles large images well. The invention aims to identify building windows and exterior walls automatically; the feature parameters used to distinguish exterior walls, exterior windows, and other objects are shape and color (reflected in RGB values), which are relatively few. The method is expected to predict the window-wall ratio of massive numbers of buildings from a small number of labeled buildings, a small-sample problem. In addition, the resolution of the photographs is generally above 3000 x 3000, i.e., they are large images. In summary, the Unet architecture suits the needs of the invention well.
2. Improvement of the Unet architecture: the Unet architecture, also known as an encoder-decoder architecture, consists of three parts: feature extraction, upsampling, and the channel. The first half performs feature extraction (also called downsampling, the encoder part), the second half performs upsampling (the decoder part), and the middle is the channel through which gradients flow. Feature extraction downsamples and encodes the input picture to condense the original picture information into high-level features; upsampling recovers the picture information to obtain the required result; the channel helps the model gather global information. The improvements to the Unet architecture are: (1) the number of sampling layers is increased and batch-normalization (BN) layers are added, improving the accuracy of feature extraction; (2) the decoder calls ResNet model parameters pre-trained for the ILSVRC competition, which improves decoding precision and greatly reduces model training time.
3. Exterior-wall recognition training: to save training time, the Unet training network requires input pictures of uniform, not-too-large resolution, preferably 250 x 250 to 600 x 600. The resolution of captured pictures is generally higher than this, so some processing is required. The two common processing methods are scaling and window scanning (window scanning means dividing the image into blocks and scanning them one by one). Scaling loses picture detail, while window scanning greatly increases training time. Since the exterior wall occupies a large proportion of the picture, it can be compressed into the Unet network by scaling, with negligible loss of detail. Binary recognition training is then performed, i.e., recognition learning with the exterior wall as one class and all other objects as the other.
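As a rough illustration of the scaling route, a nearest-neighbour resize can be sketched in a few lines; the resize method and the 512 x 512 target (chosen inside the 250-600 range stated above) are assumptions, not values fixed by the invention:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize: a stand-in for the Resize step that
    compresses a large facade photo to the fixed training resolution."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return img[rows[:, None], cols]

# A stand-in grayscale "facade photo" larger than the training size
photo = np.random.default_rng(0).integers(0, 256, size=(1200, 900), dtype=np.uint8)
small = resize_nearest(photo, 512, 512)       # compress to the training size
```

In practice a library resize with interpolation would be used; the point is only that the whole facade survives at reduced resolution.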
4. Window recognition training: windows in the pictures are numerous, but each window has a small area. If scaling were used, too much detail would be lost, harming the training effect. The picture is therefore first processed by window scanning: it is divided into small blocks, which are fed into the Unet network, and binary recognition training of the windows is performed on each block.
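The window-scanning division can be sketched as follows; the 256 x 256 tile size and zero-padding to a tile multiple are illustrative assumptions:

```python
import numpy as np

def split_tiles(img, tile):
    """Window scanning: pad the image to a multiple of the tile size,
    then cut it into a grid of tile x tile blocks (row-major order)."""
    h, w = img.shape[:2]
    ph, pw = -h % tile, -w % tile              # padding needed on each axis
    padded = np.pad(img, ((0, ph), (0, pw)))   # zero-pad bottom and right edges
    tiles = []
    for r in range(0, padded.shape[0], tile):
        for c in range(0, padded.shape[1], tile):
            tiles.append(padded[r:r + tile, c:c + tile])
    return tiles, padded.shape

img = np.ones((1000, 700), dtype=np.uint8)     # stand-in facade picture
tiles, padded_shape = split_tiles(img, 256)    # 4 x 3 grid of 256 x 256 blocks
```

Each block is then fed to the network individually, preserving window detail that scaling would destroy.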
Step four: predicting a sample picture;
1. Shooting requirements for prediction samples: in practical application, photographs of a target building's facade cannot fully meet the shooting requirements of the training samples; tilt, upward angles, and tree occlusion will inevitably occur. To stay as close as possible to real shooting conditions, the picture requirements for prediction samples are much less strict than during training, allowing some tilt and tree occlusion of the buildings in the picture.
2. Exterior-wall prediction: the exterior wall is predicted separately from the windows. The picture to be predicted is compressed by scaling to the size specified for training pictures, fed into the model for prediction, and restored to its original size after the result is obtained.
3. Window prediction: the picture to be predicted is divided by window scanning, with the division size matching the size specified for training pictures. All the divided blocks are fed into the model for prediction, and the results are stitched back to the original size.
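Conversely, the per-block predictions can be stitched back and the padding cropped away; this sketch assumes the same row-major tile order as the splitting step above:

```python
import numpy as np

def stitch_tiles(tiles, padded_shape, tile, orig_shape):
    """Reassemble per-tile predictions into one mask, then crop away
    the padding that was added before window scanning."""
    out = np.zeros(padded_shape, dtype=tiles[0].dtype)
    i = 0
    for r in range(0, padded_shape[0], tile):
        for c in range(0, padded_shape[1], tile):
            out[r:r + tile, c:c + tile] = tiles[i]
            i += 1
    return out[:orig_shape[0], :orig_shape[1]]

# Round trip on toy "predictions": 6 tiles of 4x4 covering a padded 8x12 grid,
# cropped back to an original 7x10 picture
tile = 4
tiles = [np.full((tile, tile), k) for k in range(6)]
mask = stitch_tiles(tiles, (8, 12), tile, (7, 10))
```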
4. Automatic repair of missing elements: because trees partially occlude some buildings, the exterior wall and some windows may be incomplete, and element repair is needed. The invention adopts repair based on the Conditional Random Field (CRF) algorithm, which corrects the predicted image using the shape and color information of the original image.
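The CRF repair itself needs a dedicated library, so the sketch below substitutes a much simpler morphological closing (dilation followed by erosion) as a stand-in that only illustrates filling small occluded gaps in a predicted mask; it is not the patent's CRF method, which also consults the original image's colors:

```python
import numpy as np

def binary_close(mask, reps=1):
    """Fill small gaps in a binary mask: 4-neighbour dilation, then erosion.
    A simplified stand-in for repair, NOT the CRF-based method itself."""
    def shift_or(m):                      # dilation: OR with 4-neighbours
        out = m.copy()
        out[1:, :] |= m[:-1, :]; out[:-1, :] |= m[1:, :]
        out[:, 1:] |= m[:, :-1]; out[:, :-1] |= m[:, 1:]
        return out
    def shift_and(m):                     # erosion: AND with 4-neighbours
        out = m.copy()
        out[1:, :] &= m[:-1, :]; out[:-1, :] &= m[1:, :]
        out[:, 1:] &= m[:, :-1]; out[:, :-1] &= m[:, 1:]
        return out
    for _ in range(reps):
        mask = shift_or(mask)
    for _ in range(reps):
        mask = shift_and(mask)
    return mask

# A wall mask with a one-pixel "tree branch" hole in the middle
wall = np.ones((5, 5), dtype=bool)
wall[2, 2] = False
repaired = binary_close(wall)             # the hole is filled
```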
Step five: verifying the model precision;
After automatic identification of the wall and windows and automatic repair of missing parts, the computer automatically counts the exterior-wall and window areas on each prediction image, and the window-wall ratio η_i is obtained from formula (1):

η_i = S_i,window / S_i,wall        (1)

where i denotes the i-th prediction picture, S_i,window is the total window area in the i-th prediction picture, and S_i,wall is the exterior-wall area in the i-th prediction picture. If more than 80% of the prediction samples have an error below 10%, the precision is considered to meet the requirement, and step six is executed; otherwise, return to step one and repeat steps one through four.
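Formula (1) and the 80%/10% acceptance test of step five can be sketched directly on binary masks; the helper names are illustrative:

```python
import numpy as np

def window_wall_ratio(window_mask, wall_mask):
    """Formula (1): eta_i = S_i,window / S_i,wall, with areas taken
    as pixel counts of the predicted masks."""
    return window_mask.sum() / wall_mask.sum()

def accuracy_ok(predicted, measured, share=0.8, tol=0.10):
    """Acceptance test: at least `share` of the prediction samples must
    have a relative error below `tol` against the measured ratios."""
    predicted = np.asarray(predicted)
    measured = np.asarray(measured)
    errors = np.abs(predicted - measured) / measured
    return (errors < tol).mean() >= share

# Toy example: a 10x10 wall mask containing one 4x4 window
wall = np.ones((10, 10), dtype=np.uint8)       # wall area includes the window
window = np.zeros((10, 10), dtype=np.uint8)
window[3:7, 3:7] = 1
eta = window_wall_ratio(window, wall)          # 16 / 100 = 0.16
```

Note the denominator follows the document's definition: the wall area includes the window area.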
Step six: establishing a prediction model base;
and (5) reserving the model network architecture and the related set parameters in the third step and the fourth step to form a final prediction model library.
Step seven: acquiring a picture of a vertical face of an urban building;
Pictures of target building facades can be obtained by photography, such as: hand-held phone/camera shooting, or vehicle-mounted or airborne camera shooting; they can also be obtained from free map websites, such as Baidu panorama images, which greatly reduces picture-collection time.
Step eight: predicting the picture of the vertical face of the city building;
and (4) putting the picture obtained in the step seven into the model library obtained in the step six, wherein the prediction method is the same as the step four and specifically comprises the steps of outer wall identification, window identification and automatic missing element repair.
Step nine: obtaining the window-wall ratio of the urban building;
and (5) calculating the window-wall ratio of each target building according to the areas of the outer wall and the window finally predicted and repaired in the step eight by using the formula (1).
The invention has the following advantages:
1. the invention has lower implementation cost. The data acquisition of the invention only depends on shooting or free map websites, and the model training and the picture prediction can be completed by using a common computer, so the total cost is lower;
2. the invention has higher prediction precision. The method can achieve lower prediction error for most prediction samples, has the characteristic of automatically repairing incomplete windows and walls, and ensures high precision of the whole prediction process;
3. the invention can save a great deal of time and labor. According to the method, an effective prediction model library can be established through labeling of a small number of samples and model training, and then automatic prediction of window-wall ratios of massive buildings in cities is completed.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an exemplary diagram of pixel level labeling;
FIG. 3 is a schematic diagram of the Unet architecture;
FIG. 4 is an illustration of image prediction and recognition: (a) the original image to be predicted; (b) exterior-wall recognition prediction; (c) exterior-window recognition prediction.
Detailed Description
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as illustrative only and not limiting the scope of the invention.
The embodiment discloses an automatic identification method for a window-wall ratio of an urban building, and the following further explains the specific implementation mode according to fig. 1 to 4:
as shown in fig. 1, an automatic identification method for window-wall ratio of city building includes the following steps:
the method comprises the following steps: shooting a sample building facade to obtain a sample picture;
the sample building facade shot is the initial step in building the training library. For the selection of samples, the following requirements apply:
1. the sample building should cover as many building types as possible, such as: residences, shopping malls, offices, schools, hospitals, etc.;
2. the sample building should cover as different architectural styles as possible, such as: large-area glass curtain wall office buildings and non-large-area glass curtain wall office buildings; standard rectangular facade buildings, non-rectangular facade buildings and the like;
3. the sample buildings should, as far as possible, represent buildings of different construction eras, such as: new and old residential districts, etc.
When shooting the building facade, the whole facade need not be captured completely, but the following should be ensured as far as possible:
1. shoot the target facade from a front-on view, avoiding upward or oblique angles;
2. the facade should be essentially free of occlusion, such as: trees, cars, etc.;
3. shoot the facade at a short distance so that the windows do not appear too small.
The shooting tool can select mobile equipment such as a mobile phone, a camera and the like.
Step two: labeling a sample picture;
As shown in FIG. 2, the invention adopts pixel-level labeling, i.e., labeling strictly along the outer contour of each object: the outlines of the building's exterior wall and of every window in the photograph are labeled one by one, and each labeled object is given a label, "wall" or "window". After labeling is finished, an Extensible Markup Language (XML) file is exported.
Step three: training a model;
1. Selecting the training model architecture: the basic architecture adopted by the invention is the Unet architecture. Unet, originally designed for medical image segmentation, has these advantages: it needs few input feature parameters, suits prediction from small sample sizes, and handles large images well. The invention aims to identify building windows and exterior walls automatically; the feature parameters used to distinguish exterior walls, exterior windows, and other objects are shape and color (reflected in RGB values), which are relatively few. The method is expected to predict the window-wall ratio of massive numbers of buildings from a small number of labeled buildings, a small-sample problem. In addition, the resolution of the photographs is generally above 3000 x 3000, i.e., they are large images. In summary, the Unet architecture suits the needs of the invention well.
2. Improvement of the Unet architecture (as shown in FIG. 3): the Unet architecture, also known as an encoder-decoder architecture, consists of three parts: feature extraction, upsampling, and the channel. The first half performs feature extraction (also called downsampling, the encoder part), the second half performs upsampling (the decoder part), and the middle is the channel through which gradients flow. Feature extraction downsamples and encodes the input picture to condense the original picture information into high-level features; upsampling recovers the picture information to obtain the required result; the channel helps the model gather global information. The improvements to the Unet architecture are: (1) the number of sampling layers is increased and batch-normalization (BN) layers are added, improving the accuracy of feature extraction; (2) the decoder calls ResNet model parameters pre-trained for the ILSVRC competition, which improves decoding precision and greatly reduces model training time.
3. Exterior-wall recognition training: to save training time, the Unet training network requires input pictures of uniform, not-too-large resolution, preferably 250 x 250 to 600 x 600. The resolution of captured pictures is generally higher than this, so some processing is required. The two common processing methods are scaling and window scanning (window scanning means dividing the image into blocks and scanning them one by one). Scaling loses picture detail, while window scanning greatly increases training time. Since the exterior wall occupies a large proportion of the picture, it can be compressed into the Unet network by scaling, with negligible loss of detail. Binary recognition training is then performed, i.e., recognition learning with the exterior wall as one class and all other objects as the other.
4. Window recognition training: windows in the pictures are numerous, but each window has a small area. If scaling were used, too much detail would be lost, harming the training effect. The picture is therefore first processed by window scanning: it is divided into small blocks, which are fed into the Unet network, and binary recognition training of the windows is performed on each block.
Step four: sample picture prediction (as shown in fig. 4);
1. Shooting requirements for prediction samples: in practical application, photographs of a target building's facade cannot fully meet the shooting requirements of the training samples; tilt, upward angles, and tree occlusion will inevitably occur. To stay as close as possible to real shooting conditions, the picture requirements for prediction samples are much less strict than during training, allowing some tilt and tree occlusion of the buildings in the picture.
2. Exterior-wall prediction: the exterior wall is predicted separately from the windows. The picture to be predicted is compressed by scaling to the size specified for training pictures, fed into the model for prediction, and restored to its original size after the result is obtained.
3. Window prediction: the picture to be predicted is divided by window scanning, with the division size matching the size specified for training pictures. All the divided blocks are fed into the model for prediction, and the results are stitched back to the original size.
4. Automatic repair of missing elements: because trees partially occlude some buildings, the exterior wall and some windows may be incomplete, and element repair is needed. The invention adopts repair based on the Conditional Random Field (CRF) algorithm, which corrects the predicted image using the shape and color information of the original image.
Step five: verifying the model precision;
After automatic identification of the wall and windows and automatic repair of missing parts, the computer automatically counts the exterior-wall and window areas on each prediction image, and the window-wall ratio η_i is obtained from formula (1):

η_i = S_i,window / S_i,wall        (1)

where i denotes the i-th prediction picture, S_i,window is the total window area in the i-th prediction picture, and S_i,wall is the exterior-wall area in the i-th prediction picture. If more than 80% of the prediction samples have an error below 10%, the precision is considered to meet the requirement, and step six is executed; otherwise, return to step one and repeat steps one through four.
Step six: establishing a prediction model base;
and (5) reserving the model network architecture and the related set parameters in the third step and the fourth step to form a final prediction model library.
Step seven: acquiring a picture of a vertical face of an urban building;
Pictures of target building facades can be obtained by photography, such as: hand-held phone/camera shooting, or vehicle-mounted or airborne camera shooting; they can also be obtained from free map websites, such as Baidu panorama images, which greatly reduces picture-collection time.
Step eight: predicting the picture of the vertical face of the city building;
and (4) putting the picture obtained in the step seven into the model library obtained in the step six, wherein the prediction method is the same as the step four and specifically comprises outer wall prediction, window prediction and automatic missing element repairing.
Step nine: obtaining the window-wall ratio of the urban building;
and (5) calculating the window-wall ratio of each target building according to the areas of the outer wall and the window finally predicted and repaired in the step eight by using the formula (1).
The method comprises the following specific steps:
1. Sample picture acquisition. Surveyors took 85 training pictures meeting the requirements of step one using mobile phones (with camera function) in the Xiaoshan district of Hangzhou and the administrative districts of Nanjing. In addition, 43 pictures of buildings whose actual window-wall ratios were known from government information were taken for prediction;
2. Label the training pictures. Pixel-level labels are applied one by one to the windows and exterior walls in the training pictures using the free labeling tool "eidolon labeling assistant" (see figure 2). Labeled windows are uniformly given the label "exterior window", and labeled exterior walls the label "exterior wall". After labeling is finished, an XML file is exported;
3. Picture recognition training. The Unet architecture described in step three (see figure 3) is built in the Python language and the related improvements are applied. The XML file is imported for training and learning: recognition training of the exterior wall is performed by adding statements that execute Resize (scaling), learning the position and color features of the exterior wall; recognition training of the window is performed by adding statements that execute Split (window scanning), learning the position and color features of the window;
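The window-scanning "Split" used in training (and its inverse stitching used after prediction in step four) can be sketched as below. The tile size, padding, and overlap handling are not specified in the patent, so this is a minimal sketch assuming square tiles and image sides that are exact multiples of the tile size; the helper names are illustrative.

```python
# Illustrative sketch of the "Split" (window-scanning) step: cut a 2D image
# into fixed-size tiles for per-tile prediction, then stitch results back.
# Assumes image sides are exact multiples of the tile size (no padding).

def split_tiles(image, tile):
    """Cut a 2D image (list of rows) into tile x tile blocks, row-major order."""
    h, w = len(image), len(image[0])
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            tiles.append([row[left:left + tile] for row in image[top:top + tile]])
    return tiles

def stitch_tiles(tiles, h, w, tile):
    """Inverse of split_tiles: reassemble the tiles into the original h x w image."""
    image = [[0] * w for _ in range(h)]
    per_row = w // tile
    for idx, block in enumerate(tiles):
        top, left = (idx // per_row) * tile, (idx % per_row) * tile
        for r, row in enumerate(block):
            image[top + r][left:left + len(row)] = row
    return image

# Round-trip check on a 4x4 image with 2x2 tiles
img = [[r * 4 + c for c in range(4)] for r in range(4)]
assert stitch_tiles(split_tiles(img, 2), 4, 4, 2) == img
```

In the patent's pipeline, each tile would be fed to the Unet for window prediction and the per-tile outputs stitched back to the original picture size.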
4. Predict the sample pictures. The improved Unet structure built in Python takes the learned position and color features of the exterior wall and window as input; statements executing Resize (scaling) and Split (splitting) are added, and the exterior walls and windows in the prediction samples are predicted separately, with results shown in figure 4. After prediction, statements executing the CRF algorithm are added to repair the exterior-wall and window contours. Finally, the computer automatically counts the exterior-wall and window areas in each picture and calculates the window-wall ratio with formula (1):
η_i = S_{i,window} / S_{i,wall}    (1)
where i denotes the i-th prediction picture, S_{i,window} the total window area in the i-th prediction picture, and S_{i,wall} the exterior-wall area in the i-th prediction picture;
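Under the assumption that the prediction result is a per-pixel label map and that areas are measured as pixel counts, formula (1) can be sketched as follows; the label encoding and helper names are illustrative, not part of the patent.

```python
# Hedged sketch of the final area statistics: given a per-pixel prediction map
# labeled "exterior window" / "exterior wall" (as in the labeling step), pixel
# counts stand in for areas and formula (1) gives the window-wall ratio.
# The integer label values and mask format are illustrative assumptions.

WINDOW, WALL, OTHER = 1, 2, 0

def window_wall_ratio(mask):
    """eta_i = S_window / S_wall, with areas as pixel counts (formula (1))."""
    flat = [px for row in mask for px in row]
    s_window = flat.count(WINDOW)
    s_wall = flat.count(WALL)
    if s_wall == 0:
        raise ValueError("no exterior-wall pixels predicted")
    return s_window / s_wall

# Toy 3x4 mask: 3 window pixels, 6 wall pixels -> eta = 0.5
mask = [
    [WALL, WALL, WINDOW, OTHER],
    [WALL, WALL, WINDOW, OTHER],
    [WALL, WALL, WINDOW, OTHER],
]
print(window_wall_ratio(mask))  # -> 0.5
```

Whether the wall area includes or excludes the window openings is not stated in the text; the sketch simply counts pixels under each label.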
5. Error checking. The prediction error ε_i of each picture is computed one by one with formula (2):
ε_i = |η_i′ - η_i| / η_i    (2)
where η_i and η_i′ denote, respectively, the actual and the predicted window-wall ratio of the i-th prediction picture.
Of the 43 prediction pictures, 31 have an error below 6% and 36 have an error below 10%. Thus 83.7% of the prediction samples have a recognition error below 10%, meeting the specified precision requirement. The prediction model library can therefore be established from the model network architecture and the related parameter settings.
6. Predict the window-wall ratio of urban buildings. Urban building facade pictures are acquired first: pictures of the target building facades can be obtained by shooting, for example with a hand-held phone or camera or a vehicle-mounted or airborne camera, or from free map websites such as Baidu panoramic street view, which greatly reduces the time needed to collect pictures. The building facade photos of the target area are then input into the prediction model library for automatic recognition and repair, yielding the window-wall ratio of each target building. Since the invention focuses on describing the method and establishing the prediction model library, extending the library to an entire city is routine work and is not described here.
The technical means disclosed in the scheme of the invention are not limited to those disclosed in the above embodiments, but also include technical solutions formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to fall within the scope of the present invention.
Claims (10)
1. An automatic recognition method for the window-wall ratio of urban buildings, characterized in that the method comprises the following steps:
step one: shooting a sample building facade to obtain sample pictures;
step two: labeling a sample picture;
step three: training a model;
step four: predicting a sample picture;
step five: verifying the model precision;
step six: establishing a prediction model base;
step seven: acquiring urban building facade pictures;
step eight: predicting the urban building facade pictures;
step nine: and obtaining the window-wall ratio of the urban building.
2. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein the requirements for shooting the sample building facade in step one comprise: shooting the target facade from a front view, without looking up or tilting; shooting the facade without occlusion; and shooting the building facade at close range.
3. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein the sample pictures in step two are labeled with pixel-level labeling.
4. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein the basic model-training architecture in step three is the Unet architecture; the Unet architecture comprises feature extraction, up-sampling, and channels: the first half performs feature extraction, the second half performs up-sampling, and the middle provides channels for gradient flow; the Unet architecture increases the number of sampling layers, adds bn (batch normalization) layers, and the decoder calls ResNet model parameters pre-trained in the ILSVRC competition.
5. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 4, wherein in step three the exterior wall is compressed by scaling and fed into the Unet network for recognition training, and the window is processed by window scanning, with the divided small blocks fed separately into the Unet network for recognition training.
6. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein the exterior-wall prediction in step four compresses the picture by scaling, feeds it into the model for prediction, and restores it to the original size after the result is obtained; the window prediction divides the picture by window scanning, feeds all divided pictures into the model for prediction, and stitches them back to the original size after the result is obtained; and occluded, defective elements are repaired based on the CRF algorithm.
7. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein after step four the computer automatically counts the exterior-wall and window areas in each prediction picture and calculates the window-wall ratio η_i according to formula (1):
η_i = S_{i,window} / S_{i,wall}    (1)
where i denotes the i-th prediction picture, S_{i,window} the total window area in the i-th prediction picture, and S_{i,wall} the exterior-wall area in the i-th prediction picture; and the model precision verification requirement in step five is: if more than 80% of the prediction samples have an error below 10%, the precision is considered to meet the requirement and step six is executed; otherwise, return to step one and repeat steps one through four.
8. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein the prediction model library in step six is established from the model network architectures and the parameter settings of steps three and four.
9. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 1, wherein the urban building facade picture prediction in step eight feeds the pictures into the model prediction library established in step six; the exterior-wall prediction compresses the picture to a specified size by scaling, feeds it into the model prediction library for prediction, and restores it to the original size after the result is obtained; the window prediction divides the picture by window scanning, with the division size matching the specified size, then feeds all divided pictures into the model prediction library for prediction and stitches them back to the original size after the result is obtained; and defective elements are repaired with the CRF algorithm.
10. The method for automatically identifying the window-wall ratio of an urban building as claimed in claim 7, wherein the window-wall ratio of the urban buildings in step nine is calculated with formula (1) from the exterior-wall and window areas obtained after the prediction and repair of step eight, yielding the window-wall ratio of each target building.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910964461.3A CN110956196B (en) | 2019-10-11 | 2019-10-11 | Automatic recognition method for window wall ratio of urban building |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110956196A true CN110956196A (en) | 2020-04-03 |
CN110956196B CN110956196B (en) | 2024-03-08 |
Family
ID=69975578
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910964461.3A Active CN110956196B (en) | 2019-10-11 | 2019-10-11 | Automatic recognition method for window wall ratio of urban building |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110956196B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012193505A (en) * | 2011-03-15 | 2012-10-11 | Sekisui Chem Co Ltd | Repair structure for drain riser pipe line of multistoried building, repair method for drain riser of multistoried building, and collective joint for use in repair structure and repair method |
CN103163181A (en) * | 2013-03-21 | 2013-06-19 | 山东省计算中心 | Automatic thermotechnical area identification method based on outdoor scene infrared image of building |
WO2013185401A1 (en) * | 2012-06-15 | 2013-12-19 | 深圳市华星光电技术有限公司 | Mask and correction method thereof |
CN105279787A (en) * | 2015-04-03 | 2016-01-27 | 北京明兰网络科技有限公司 | Method for generating three-dimensional (3D) building model based on photographed house type image identification |
CN106934140A (en) * | 2017-03-07 | 2017-07-07 | 西安理工大学 | Building energy conservation automatic checking method based on BIM |
CN107093205A (en) * | 2017-03-15 | 2017-08-25 | 北京航空航天大学 | A kind of three dimensions building window detection method for reconstructing based on unmanned plane image |
CN107622503A (en) * | 2017-08-10 | 2018-01-23 | 上海电力学院 | A kind of layering dividing method for recovering image Ouluding boundary |
CN108460833A (en) * | 2018-03-28 | 2018-08-28 | 中南大学 | A kind of information platform building traditional architecture digital protection and reparation based on BIM |
CN108596465A (en) * | 2018-04-17 | 2018-09-28 | 西安建筑科技大学 | A kind of urban residence building system carbon energy measuring method |
CN108763606A (en) * | 2018-03-12 | 2018-11-06 | 江苏艾佳家居用品有限公司 | A kind of floor plan element extraction method and system based on machine vision |
CN109446992A (en) * | 2018-10-30 | 2019-03-08 | 苏州中科天启遥感科技有限公司 | Remote sensing image building extracting method and system, storage medium, electronic equipment based on deep learning |
CN109992908A (en) * | 2019-04-08 | 2019-07-09 | 东南大学 | A kind of Urban Building Energy Consumption simulation system |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275755A (en) * | 2020-04-28 | 2020-06-12 | 中国人民解放军总医院 | Mitral valve orifice area detection method, system and equipment based on artificial intelligence |
CN112200029A (en) * | 2020-09-27 | 2021-01-08 | 电子科技大学 | Remote sensing image building extraction method based on improved UNet + + network |
CN112200029B (en) * | 2020-09-27 | 2022-03-25 | 电子科技大学 | Remote sensing image building extraction method based on improved UNet + + network |
CN112613369A (en) * | 2020-12-15 | 2021-04-06 | 中国建筑第八工程局有限公司 | Method and system for calculating area of building window |
CN114973297A (en) * | 2022-06-17 | 2022-08-30 | 广州市圆方计算机软件工程有限公司 | Wall area identification method, system, equipment and medium for planar house type graph |
CN117115243A (en) * | 2023-10-23 | 2023-11-24 | 北京科技大学 | Building group outer facade window positioning method and device based on street view picture |
CN117115243B (en) * | 2023-10-23 | 2024-02-09 | 北京科技大学 | Building group outer facade window positioning method and device based on street view picture |
Also Published As
Publication number | Publication date |
---|---|
CN110956196B (en) | 2024-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110956196B (en) | Automatic recognition method for window wall ratio of urban building | |
CN114758252B (en) | Image-based distributed photovoltaic roof resource segmentation and extraction method and system | |
CN109919025A (en) | Video scene Method for text detection, system, equipment and medium based on deep learning | |
CN110826549A (en) | Inspection robot instrument image identification method and system based on computer vision | |
CN111046462A (en) | Drawing display system and method for outdoor building design | |
CN111985455A (en) | Training and identifying method and device for photovoltaic module visible light fault model | |
CN112785643A (en) | Indoor wall corner two-dimensional semantic map construction method based on robot platform | |
CN106530345A (en) | Building three-dimensional laser point cloud feature extraction method based on assistance of three-dimensional laser scanning system/digital camera images | |
CN111797920B (en) | Remote sensing extraction method and system for depth network impervious surface with gate control feature fusion | |
CN113743358B (en) | Landscape vision feature recognition method adopting omnibearing collection and intelligent calculation | |
CN114758337B (en) | Semantic instance reconstruction method, device, equipment and medium | |
CN115661032A (en) | Intelligent pavement disease detection method suitable for complex background | |
CN111104850B (en) | Remote sensing image building automatic extraction method and system based on residual error network | |
CN113420109B (en) | Method for measuring permeability of street interface, computer and storage medium | |
CN113902792A (en) | Building height detection method and system based on improved RetinaNet network and electronic equipment | |
CN112000758B (en) | Three-dimensional urban building construction method | |
CN111626971B (en) | Smart city CIM real-time imaging method with image semantic perception | |
CN117496426A (en) | Precast beam procedure identification method and device based on mutual learning | |
CN116682006A (en) | Urban building style element extraction method and system based on street view picture | |
CN110929690A (en) | Remote sensing image road network extraction method based on deep neural network | |
CN115493596A (en) | Semantic map construction and navigation method for mobile robot | |
CN116229001A (en) | Urban three-dimensional digital map generation method and system based on spatial entropy | |
CN115713603A (en) | Multi-type block building group form intelligent generation method based on building space map | |
CN115984603A (en) | Fine classification method and system for urban green land based on GF-2 and open map data | |
CN115223114A (en) | End-to-end vehicle attitude estimation method based on bidirectional fusion feature pyramid |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||