CN113283366A

CN113283366A - Building side information extraction method based on learning vector quantization algorithm

Info

Publication number: CN113283366A
Application number: CN202110630283.8A
Authority: CN
Inventors: 田鹏飞; 孙伟; 吴丹
Original assignee: Yijing Zhilian Beijing Technology Co Ltd
Current assignee: Yijing Zhilian Beijing Technology Co Ltd
Priority date: 2021-06-07
Filing date: 2021-06-07
Publication date: 2021-08-20

Abstract

The invention discloses a building side information extraction method based on a learning vector quantization (LVQ) algorithm. The method comprises the steps of: constructing a learning vector quantization neural network training set; setting sampling parameters and feature vectors; initializing the weights and the learning rate; calculating the Euclidean distance; recording the class labels of the output layer neurons; updating the weights according to the class labels; and outputting a side information clustering result. The invention uses the learning vector quantization algorithm to extract building side information and represents the topological structure of the data with a small number of weight vectors, which gives the LVQ algorithm wider application in the field of pattern recognition and allows the extraction precision to be improved quickly in building side information extraction.

Description

Building side information extraction method based on learning vector quantization algorithm

Technical Field

The invention relates to the technical field of building information extraction, in particular to a method for extracting building side information based on a learning vector quantization algorithm.

Background

With the development of multi-source remote sensing technology, high-resolution remote sensing images of the same region at different time phases can now be acquired quickly and conveniently. These images carry higher-resolution ground feature information and richer ground feature spectral information, and so provide spatial data support for extracting building side information.

The learning vector quantization (LVQ) algorithm is a supervised neural network classification method with a simple structure and strong capability. As a nearest-neighbor prototype classifier, it has been applied successfully in many fields, including statistics, pattern recognition, and machine learning. During training, LVQ continuously updates the neuron weight vectors (prototypes) and continuously adjusts their learning rate, so that the boundaries between weight vectors of different classes can gradually converge to the Bayesian classification boundary. In the algorithm, the winning neuron (the nearest-neighbor weight vector) is selected by calculating the Euclidean distance between the input sample and each weight vector. The most outstanding advantage of LVQ is its adaptivity: a codebook of the training samples can be obtained by online learning.

At present, building side information is extracted from high-resolution remote sensing images alone, so the accuracy of the extracted information is low. In addition, the extraction relies on a clustering algorithm combined with manually extracted feature information, which makes the operation complicated and cannot guarantee the accuracy of the extraction. Existing extraction approaches therefore cannot meet current requirements for popularization and application.

Disclosure of Invention

The invention provides a method for extracting building side information based on a learning vector quantization algorithm. It can effectively solve the following problems of existing approaches: building side information is extracted from high-resolution remote sensing images alone, so the extraction accuracy is low; the extraction relies on a clustering algorithm and manually extracted feature information, so the operation is complicated and the accuracy cannot be guaranteed; and the existing extraction approaches therefore cannot meet current requirements for popularization and application.

To achieve this purpose, the invention provides the following technical solution: a building side information extraction method based on a learning vector quantization algorithm, which specifically comprises the following extraction steps:

S1, constructing a learning vector quantization neural network training set;

S2, setting sampling parameters and feature vectors;

S3, initializing the weights and the learning rate;

S4, calculating the Euclidean distance;

S5, recording the class labels of the output layer neurons;

S6, updating the weights according to the class labels;

S7, outputting a side information clustering result;

and S8, sampling and evaluating the accuracy.

According to the above technical solution, in S1, constructing the learning vector quantization neural network training set means preprocessing the high-resolution remote sensing image data of the buildings whose information is to be extracted, then cutting the target images out of the preprocessed remote sensing image data and establishing building class labels.

During class labeling, the side faces of the building are set as the foreground and all other faces of the building as the background, and a single-channel grayscale image is output to serve as the network training set.
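As a minimal sketch of the labeling rule above, the following Python snippet turns an annotated tile into a single-channel foreground/background mask. The class ids (0 = ground, 1 = roof, 2 = side face), the gray values 255/0, and the function name `to_binary_mask` are assumptions made for illustration; the source does not specify an encoding.

```python
# Hypothetical class ids and gray values; the source does not define them.
SIDE_FACE = 2
FOREGROUND, BACKGROUND = 255, 0

def to_binary_mask(label_tile, side_id=SIDE_FACE):
    """Map a 2-D grid of class ids to a single-channel mask.

    Pixels annotated as building side face become foreground (255);
    every other pixel becomes background (0).
    """
    return [[FOREGROUND if value == side_id else BACKGROUND for value in row]
            for row in label_tile]

# Toy 3x4 annotation: 0 = ground, 1 = roof, 2 = side face (assumed ids).
tile = [
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
]
mask = to_binary_mask(tile)
for row in mask:
    print(row)
```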

According to the above technical solution, in S2, setting the sampling parameters and feature vectors means cyclically inputting the original image and the labeled image according to the chosen pixel interval and block window size, calculating the number of blocks, counting the positive and negative samples block by block, and then balancing the numbers of positive and negative samples by uniform sampling.
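The block sampling described above might look like the following sketch. It assumes that the feature vector of a block is simply its flattened pixel values and that a block is positive when the mask pixel at its centre is foreground; the source fixes neither choice, and the names `extract_blocks` and `balance` are hypothetical.

```python
import random

def extract_blocks(image, mask, window=3, stride=2):
    """Slide a window over the image and label each block by the mask at its centre pixel."""
    blocks = []
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - window + 1, stride):
        for c in range(0, cols - window + 1, stride):
            # Assumed feature vector: the flattened gray values of the window.
            features = [image[r + i][c + j] for i in range(window) for j in range(window)]
            label = 1 if mask[r + window // 2][c + window // 2] > 0 else 0
            blocks.append((features, label))
    return blocks

def balance(blocks, seed=0):
    """Uniformly subsample the larger class so both classes have the same size."""
    rng = random.Random(seed)
    positives = [b for b in blocks if b[1] == 1]
    negatives = [b for b in blocks if b[1] == 0]
    n = min(len(positives), len(negatives))
    return rng.sample(positives, n) + rng.sample(negatives, n)

# Toy 7x7 gray image and mask; the two right-hand columns are "side face".
image = [[(7 * r + c) % 256 for c in range(7)] for r in range(7)]
mask = [[255 if c >= 5 else 0 for c in range(7)] for r in range(7)]

blocks = extract_blocks(image, mask)
balanced = balance(blocks)
print(len(blocks), "blocks,", len(balanced), "after balancing")
```

Here `stride` plays the role of the pixel interval and `window` the role of the block window size.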

According to the above technical solution, in S3, initializing the weights and the learning rate specifically means initializing the weights and the learning rate between the input layer and the competition layer. The input layer is directly connected to the output layer, each output layer node has a weight vector connected to it, and the purpose of learning is to find the optimal values of the weights.
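One common way to perform this initialization, shown here only as a sketch, is to copy a few training vectors of each class as the starting weight vectors. The source does not state how the weights are initialized, so the strategy, the value `0.1` for the learning rate, and the name `init_prototypes` are all assumptions.

```python
import random

def init_prototypes(samples, labels, per_class=2, seed=0):
    """Pick `per_class` training vectors from each class as initial weight vectors."""
    rng = random.Random(seed)
    weights, proto_labels = [], []
    for cls in sorted(set(labels)):
        members = [s for s, l in zip(samples, labels) if l == cls]
        for vector in rng.sample(members, per_class):
            weights.append(list(vector))   # copy, so training never mutates the data
            proto_labels.append(cls)
    return weights, proto_labels

# Toy two-class data: label 0 = background, label 1 = side face.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
labels = [0, 0, 0, 1, 1, 1]

weights, proto_labels = init_prototypes(samples, labels)
learning_rate = 0.1   # assumed starting value, not taken from the source
print(proto_labels, weights)
```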

According to the above technical solution, in S4, calculating the Euclidean distance specifically means feeding the input vector to the input layer and calculating the Euclidean distance between each competition layer neuron and the input vector.

The Euclidean distance is a commonly used distance definition. It is the true distance between two points in a multidimensional space, or the natural length of a vector (that is, the distance from the point to the origin); in two-dimensional and three-dimensional space it is the actual distance between the two points.
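For an input vector x and a weight vector w of the same dimension n, the Euclidean distance is the square root of the sum of (x_i - w_i)^2 over i = 1..n. A short Python sketch of this step follows; the function names and the toy vectors are illustrative only.

```python
import math

def euclidean(x, w):
    """Euclidean distance between an input vector and one weight vector."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

def distances(x, weights):
    """Distance from the input vector to every competition layer neuron."""
    return [euclidean(x, w) for w in weights]

x = [3.0, 4.0]
weights = [[0.0, 0.0], [3.0, 0.0], [3.0, 4.0]]
print(distances(x, weights))  # [5.0, 4.0, 0.0]
```

On Python 3.8 and later, the standard library function `math.dist` computes the same quantity.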

According to the above technical solution, in S5, recording the output layer neuron class label specifically means selecting the competition layer neuron with the minimum distance to the input vector. Each neuron of the input layer corresponds to one input feature, so the neurons arranged in parallel correspond to the input multidimensional feature vector. The input layer is fully connected to the hidden layer (that is, the competition layer) through weights: each input layer neuron is connected to every hidden layer neuron. The dimension of the feature vector and the number of hidden layer neurons determine a weight matrix, each row vector of which corresponds to one hidden layer neuron.

The hidden layer neurons are distributed in the multidimensional feature space and form the class centers of the classification. Each hidden layer neuron corresponds, through a weight matrix, to an output layer neuron that represents the final class, and the class label of the output layer neuron connected to the selected neuron is recorded.
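The winner selection and label lookup described above can be sketched as follows. Each row of `weights` stands for one hidden layer neuron, and `proto_labels` is an assumed flat encoding of the hidden-to-output mapping (one class label per hidden neuron); both names are hypothetical.

```python
import math

def winner(x, weights):
    """Index of the hidden layer neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))

def predict(x, weights, proto_labels):
    """Class label of the output layer neuron connected to the winning neuron."""
    return proto_labels[winner(x, weights)]

# Weight matrix: one row per hidden layer neuron, one column per input feature.
weights = [[0.0, 0.0], [0.2, 0.2], [1.0, 1.0]]
proto_labels = [0, 0, 1]   # two background neurons, one side-face neuron

print(winner([0.15, 0.1], weights), predict([0.15, 0.1], weights, proto_labels))
print(winner([0.9, 0.8], weights), predict([0.9, 0.8], weights, proto_labels))
```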

According to the above technical solution, in S6, updating the weights according to the class labels means updating the weight vector according to the class label corresponding to the input vector. The neural network continuously adjusts the connection weights by learning from the training sample data, and the learning rate is continuously adjusted in the process.
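The source does not spell out the update formula. A plausible reading is the standard LVQ1 rule, in which the winning weight vector moves toward the input when their class labels agree and away from it otherwise. The sketch below assumes that rule and a linearly decaying learning rate; both are assumptions, as are the function names.

```python
def lvq1_update(w, x, same_class, lr):
    """Standard LVQ1 step: attract the winner if labels match, repel it otherwise."""
    sign = 1.0 if same_class else -1.0
    return [wi + sign * lr * (xi - wi) for wi, xi in zip(w, x)]

def decayed_lr(lr0, step, total_steps):
    """Assumed schedule: the learning rate falls linearly from lr0 toward zero."""
    return lr0 * (1.0 - step / total_steps)

w = [0.0, 0.0]
x = [1.0, 2.0]
print(lvq1_update(w, x, same_class=True, lr=0.5))    # moves toward x
print(lvq1_update(w, x, same_class=False, lr=0.5))   # moves away from x
print(decayed_lr(0.1, step=5, total_steps=10))
```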

According to the above technical solution, in S7, outputting the side information clustering result specifically means training with the specified learning rate and number of iterations until the preset iteration-count requirement or precision requirement is reached, and then outputting the side information clustering result.
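The stopping rule above (stop at a preset number of iterations or once a precision requirement is met) can be sketched as a training loop. The loop assumes the standard LVQ1 update and a linearly decaying learning rate, measures precision as accuracy on the training set, and uses hypothetical names and default values throughout; none of these details are given in the source.

```python
import math
import random

def nearest(x, weights):
    """Index of the weight vector closest to x."""
    return min(range(len(weights)), key=lambda k: math.dist(x, weights[k]))

def accuracy(samples, labels, weights, proto_labels):
    hits = sum(proto_labels[nearest(x, weights)] == y for x, y in zip(samples, labels))
    return hits / len(samples)

def train_lvq(samples, labels, weights, proto_labels,
              lr0=0.1, max_epochs=50, target_accuracy=0.99, seed=0):
    """Train until max_epochs is reached or accuracy meets target_accuracy."""
    rng = random.Random(seed)
    weights = [list(w) for w in weights]           # work on a copy
    order = list(range(len(samples)))
    acc, epochs_run = 0.0, 0
    for epoch in range(max_epochs):
        lr = lr0 * (1.0 - epoch / max_epochs)      # assumed linear decay
        rng.shuffle(order)
        for i in order:
            x = samples[i]
            j = nearest(x, weights)
            sign = 1.0 if proto_labels[j] == labels[i] else -1.0
            weights[j] = [w + sign * lr * (xi - w) for w, xi in zip(weights[j], x)]
        epochs_run = epoch + 1
        acc = accuracy(samples, labels, weights, proto_labels)
        if acc >= target_accuracy:                 # precision requirement met
            break
    return weights, epochs_run, acc

samples = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9], [0.8, 0.9]]
labels = [0, 0, 0, 1, 1, 1]
trained, epochs_run, final_acc = train_lvq(samples, labels, [[0.4, 0.4], [0.6, 0.6]], [0, 1])
print(epochs_run, final_acc, trained)
```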

According to the above technical solution, in S8, sampling and evaluating the accuracy specifically means evaluating the accuracy of the LVQ learning by sampling. An evaluation threshold is determined first; once the accuracy meets the threshold requirement, the complete building side information result is output for the area, which completes the building side information extraction for the whole area.
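A sampling-based accuracy check of this kind might be sketched as follows. The sample size, the threshold value of 0.9, and the function names are assumptions chosen for illustration; the source gives no concrete values.

```python
import random

def sampled_accuracy(predicted, reference, n_samples, seed=0):
    """Fraction of randomly sampled positions where prediction and reference agree."""
    rng = random.Random(seed)
    indices = rng.sample(range(len(reference)), n_samples)
    hits = sum(predicted[i] == reference[i] for i in indices)
    return hits / n_samples

def accept_result(predicted, reference, threshold=0.9, n_samples=50, seed=0):
    """True when the sampled accuracy meets the evaluation threshold."""
    return sampled_accuracy(predicted, reference, n_samples, seed) >= threshold

# Toy reference labels and two extreme predictions.
reference = [i % 2 for i in range(200)]
perfect = list(reference)
inverted = [1 - v for v in reference]

print(accept_result(perfect, reference))    # True
print(accept_result(inverted, reference))   # False
```

Only when `accept_result` returns true would the full-area result be written out; otherwise training or sampling parameters would need to be revisited.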

Compared with the prior art, the invention has the beneficial effects that:

The invention uses a learning vector quantization algorithm to extract building side information and represents the topological structure of the data with a small number of weight vectors. Compared with an unsupervised self-organizing neural network algorithm, the LVQ algorithm introduces a supervision signal into the weight vector update process, which gives it wider application in the field of pattern recognition and allows the extraction precision to be improved quickly in building side information extraction.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.

In the drawings:

FIG. 1 is a flow chart of the steps of the extraction method of the present invention;

FIG. 2 is a structural diagram of the learning vector quantization neural network of the present invention.

Detailed Description

The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.

Example: as shown in FIGS. 1-2, the present invention provides a technical solution: a method for extracting building side information based on a learning vector quantization algorithm, which specifically comprises the following extraction steps:

S1, constructing a learning vector quantization neural network training set;

S2, setting sampling parameters and feature vectors;

S3, initializing the weights and the learning rate;

S4, calculating the Euclidean distance;

S5, recording the class labels of the output layer neurons;

S6, updating the weights according to the class labels;

S7, outputting a side information clustering result;

and S8, sampling and evaluating the accuracy.

Based on the above technical solution, in S1, constructing the learning vector quantization neural network training set means preprocessing the high-resolution remote sensing image data of the buildings whose information is to be extracted, then cutting the target images out of the preprocessed remote sensing image data and establishing building class labels.

Based on the above technical solution, in S2, setting the sampling parameters and feature vectors means cyclically inputting the original image and the labeled image according to the chosen pixel interval and block window size, calculating the number of blocks, counting the positive and negative samples block by block, and then balancing the numbers of positive and negative samples by uniform sampling.

Based on the above technical solution, in S3, initializing the weights and the learning rate specifically means initializing the weights and the learning rate between the input layer and the competition layer. The input layer is directly connected to the output layer, each output layer node has a weight vector connected to it, and the purpose of learning is to find the optimal values of the weights.

Based on the above technical solution, in S4, calculating the Euclidean distance specifically means feeding the input vector to the input layer and calculating the Euclidean distance between each competition layer neuron and the input vector.

Based on the above technical solution, in S5, recording the output layer neuron class label specifically means selecting the competition layer neuron with the minimum distance to the input vector. Each neuron of the input layer corresponds to one input feature, so the neurons arranged in parallel correspond to the input multidimensional feature vector. The input layer is fully connected to the hidden layer through weights: each input layer neuron is connected to every hidden layer neuron. The dimension of the feature vector and the number of hidden layer neurons determine a weight matrix, each row vector of which corresponds to one hidden layer neuron.

Based on the above technical solution, in S6, updating the weights according to the class labels means updating the weight vector according to the class label corresponding to the input vector. The neural network continuously adjusts the connection weights by learning from the training sample data, and the learning rate is continuously adjusted in the process.

Based on the above technical solution, in S7, outputting the side information clustering result specifically means training with the specified learning rate and number of iterations until the preset iteration-count requirement or precision requirement is reached, and then outputting the side information clustering result.

Based on the above technical solution, in S8, sampling and evaluating the accuracy specifically means evaluating the accuracy of the LVQ learning by sampling. An evaluation threshold is determined first; once the accuracy meets the threshold requirement, the complete building side information result is output for the area, which completes the building side information extraction for the whole area.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A building side information extraction method based on a learning vector quantization algorithm, characterized in that the method specifically comprises the following extraction steps:

S1, constructing a learning vector quantization neural network training set;

S2, setting sampling parameters and feature vectors;

S3, initializing the weights and the learning rate;

S4, calculating the Euclidean distance;

S5, recording the class labels of the output layer neurons;

S6, updating the weights according to the class labels;

S7, outputting a side information clustering result;

and S8, sampling and evaluating the accuracy.

2. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S1, constructing the learning vector quantization neural network training set means preprocessing the high-resolution remote sensing image data of the buildings whose information is to be extracted, then cutting the target images out of the preprocessed remote sensing image data and establishing building class labels.

3. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S2, setting the sampling parameters and feature vectors means cyclically inputting the original image and the labeled image according to the chosen pixel interval and block window size, calculating the number of blocks, counting the positive and negative samples block by block, and then balancing the numbers of positive and negative samples by uniform sampling.

4. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S3, initializing the weights and the learning rate specifically means initializing the weights and the learning rate between the input layer and the competition layer; the input layer is directly connected to the output layer, each output layer node has a weight vector connected to it, and the purpose of learning is to find the optimal values of the weights.

5. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S4, calculating the Euclidean distance specifically means feeding the input vector to the input layer and calculating the Euclidean distance between each competition layer neuron and the input vector.

6. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S5, recording the output layer neuron class label specifically means selecting the competition layer neuron with the minimum distance to the input vector; each neuron of the input layer corresponds to one input feature, and the neurons arranged in parallel correspond to the input multidimensional feature vector; the input layer is fully connected to the hidden layer through weights, that is, each input layer neuron is connected to every hidden layer neuron; and the dimension of the feature vector and the number of hidden layer neurons determine a weight matrix, each row vector of which corresponds to one hidden layer neuron.

7. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S6, updating the weights according to the class labels means updating the weight vector according to the class label corresponding to the input vector; the neural network continuously adjusts the connection weights by learning from the training sample data, and the learning rate is continuously adjusted in the process.

8. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S7, outputting the side information clustering result specifically means training with the specified learning rate and number of iterations until the preset iteration-count requirement or precision requirement is reached, and then outputting the side information clustering result.

9. The method for extracting the building side information based on the learning vector quantization algorithm according to claim 1, wherein: in S8, sampling and evaluating the accuracy specifically means evaluating the accuracy of the LVQ learning by sampling; an evaluation threshold is determined first, and once the accuracy meets the threshold requirement, the complete building side information result is output for the area, which completes the building side information extraction for the whole area.