CN109635748B - Method for extracting road characteristics in high-resolution image - Google Patents

Method for extracting road characteristics in high-resolution image

Info

Publication number
CN109635748B
CN109635748B · CN201811532429.XA
Authority
CN
China
Prior art keywords
sample
generator
road
discriminator
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811532429.XA
Other languages
Chinese (zh)
Other versions
CN109635748A (en)
Inventor
邓钰桥
林报嘉
刘艾涵
邱中原
王琼
罗治敏
郑鑫臻
Current Assignee
China Highway Engineering Consultants Corp
CHECC Data Co Ltd
Original Assignee
China Highway Engineering Consultants Corp
CHECC Data Co Ltd
Priority date
Filing date
Publication date
Application filed by China Highway Engineering Consultants Corp, CHECC Data Co Ltd filed Critical China Highway Engineering Consultants Corp
Priority to CN201811532429.XA priority Critical patent/CN109635748B/en
Publication of CN109635748A publication Critical patent/CN109635748A/en
Application granted
Publication of CN109635748B publication Critical patent/CN109635748B/en
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/182Network patterns, e.g. roads or rivers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A method for extracting road features in high-resolution images. A generator G capable of accurately identifying road features is obtained by optimizing the structures of the generator G, the discriminator D, and the generative adversarial network V(D, G) and by training their parameters. During training, the invention evaluates the difference between the generated samples P_g(x) and the real samples P_data(x) using the Wasserstein distance, which effectively avoids the non-convergence of the loss-function gradient that arises when the existing KL divergence or JS divergence is used for this evaluation. The road features identified by the resulting generator G are closer to the actual situation and have greater accuracy.

Description

Method for extracting road characteristics in high-resolution image
Technical Field
The invention relates to the field of image processing, in particular to a method for extracting road characteristics in a high-resolution image.
Background
The continuous rapid development of high-resolution remote sensing technology and aerial photography acquisition methods has driven the wide application of high-resolution images. The resulting remote sensing images, with a ground resolution of less than 10 m, are also well suited to civil fields such as urban traffic planning and environmental monitoring.
By 2018, a number of Gaofen ("high-resolution") satellites had been put into use. The transportation-industry data center of the high-resolution Earth observation system concurrently opens high-resolution data to the transportation industry. It can be expected that the combination of "high resolution + transportation" will further raise the level of fine-grained management of road resources.
Roads are among the most basic land-feature information in the GIS field; they are not only important national assets but also key links connecting regional economies and enabling resident travel. Grasping the current state of the roads and its changes in real time and effectively is very important. Road geographic features must be informatized in order to acquire the current state of the roads and its changes, thereby realizing optimal management and utilization of road resources.
Because early road construction and completion work lacked a complete data reporting process, especially for low-grade roads in remote rural areas, the data support required for informatization is missing. In order to manage these road resources, the technology of extracting road features from high-resolution images was developed: road resource managers hope to extract effective basic road data, according to road characteristics, from existing data sources reflecting road geographic information, forming an information resource that can be managed remotely.
Automatic road extraction from high-resolution images has long been a research focus of scholars at home and abroad, because it can greatly save labor. Various feasible schemes have been proposed to meet this need. In recent years, the rapid progress of deep learning research, an artificial intelligence technology that has made great breakthroughs in the field of image processing, has provided new ideas and inspiration for road network extraction. Because deep learning is trained on a large number of samples, the machine acquires the ability to learn, finds common features among objects, and makes decisions by means of the trained model. The application of deep learning further improves the accuracy and efficiency of recognition and detection.
At present, the mainstream data sources for road feature extraction are various types of satellite remote sensing images; that is, one or more kinds of features meeting specific conditions are sought in a digital image, and the geographic information of the road is recovered from the pixels where those features lie. In 1995, Vosselman and Knecht summarized the common road characteristics into four categories:
1) Geometric characteristics: the natural geometric properties of road features. In the image, a road appears as a long, thin strip of roughly constant width whose course changes slowly along the direction of extension. Road width reflects road grade to a certain degree: high-grade roads are straighter and more regular, while low-grade roads are more winding and rugged;
2) Spectral characteristics: the material properties of road features. In a remote sensing image of a single scene, the color distribution inside a road is uniform and its spectral characteristics are consistent, while a certain contrast exists between the road and the scenery on its two sides. A fairly distinct spectral boundary is easily observed at high resolution;
3) Topological characteristics: the structural relations of road features. In the image, two roads may intersect in an X, Y, or T shape, the angle of intersection is limited to a certain range, and the roads connect with one another to form a complete road network;
4) Context characteristics: other elements in the image that can assist in identifying road features, i.e., features associated with roads, such as continuous vegetation or building information along a stretch, or village and town information in the wider scene.
Road extraction research based on remote sensing imagery began in the 1970s. However, because early computing power and digital image processing technology were relatively backward, the hardware could not cope with the large sample volumes involved in remote sensing image extraction, and early remote sensing images were therefore mostly processed manually. With the improvement of computing equipment and the deepening of research, scholars at home and abroad have proposed various road feature identification methods based on image feature extraction.
According to the feature properties on which road feature extraction depends, existing extraction methods can be divided into pixel-based methods and object-oriented methods. Pixel-based road feature extraction relies on the geometric characteristics of roads: taking image pixels as the basic unit and starting from spectral and spatial characteristics, it finds pixel regions meeting the requirements, thereby segmenting and identifying road locations. Object-oriented road feature extraction relies on the spectral characteristics of roads and models the road as a whole, so that data and operations are concentrated in objects, from which the roads are identified in the image.
Although the conventional road feature extraction methods effectively reduce the cost of manual extraction and greatly improve extraction efficiency, their recognition accuracy is still far from that of manual identification. In particular, on road sections with occlusions or shadows, the extraction results produced by the above conventional methods still require manual post-processing for repair. Specifically:
1) for roads of a specific area or type, although the various traditional methods can achieve good results, they lack universality: parameters must be specially designed for the roads of that area or type, and transfer performance is poor;
2) although various methods achieve certain results on low- and medium-resolution remote sensing images, they generally perform poorly against the fine extraction requirements posed by high-resolution remote sensing imagery;
3) the traditional methods cannot make full use of the various features in a remote sensing image for comprehensive judgment, and usually focus on processing only part of the features and layers in the image;
4) the degree of automation is low, and manual pre-processing and post-processing are frequently involved.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a method for extracting road features in high-resolution images that has simple image-preprocessing requirements, achieves more accurate recognition, and adapts to various image scenes.
Firstly, in order to achieve the above object, a method for extracting road features in a high-resolution image is provided, which includes the following steps: firstly, obtaining a remote sensing image, and slicing it to a fixed size to obtain remote sensing image slices x; and secondly, inputting the remote sensing image slices into a trained generator G for a forward propagation operation to obtain the road features therein, and outputting the road features. In the second step, the generator G is obtained by training according to the following steps:

step s1, slicing the sample image to a fixed size to obtain image slice samples z, and marking the road elements in each image slice sample z;

step s2, constructing a generator G and a discriminator D, and initializing the generative adversarial network V(D, G); wherein the generator G is a residual network, the discriminator D is a convolutional network, the loss function of the generator G is

log(1 - D(G(z)))

and the loss function of the discriminator D is

-((1 - t)·log(1 - D(G(z))) + t·log D(x)),

where t = 1 indicates that the input is a remote sensing image slice and t = 0 indicates that the input is a sample image slice; D̂(·) denotes the rounded output of the convolutional network,

D̂(·) = 1 if D(·) > 0.5, and D̂(·) = 0 otherwise,

the threshold generally being 0.5: when the output result is greater than 0.5, the model judges the input data to be a real sample and rounds up to 1; otherwise the input data is judged to be a generated sample and rounded down to 0. These two terms are set because both the source of the input distribution and the feedback of the output result must be considered during generator training;

step s3, setting the optimization goal

min_G max_D V(D, G),

where min_G and max_D give the optimization function and its optimization direction, P_data(x) denotes the distribution of all remote sensing image slices x, i.e., the real samples, P_z(z) denotes the prior distribution of the image slice samples z, and E denotes the expectation over the overall data distribution in the training process; then

V(D, G) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(z)}[log(1 - D(G(z)))]
        = E_{x~P_data(x)}[log D(x)] + E_{x~P_g(x)}[log(1 - D(x))],

where P_g(x) is the distribution of the generated samples obtained by the generator G;

step s4, evaluating the difference between the generated samples P_g(x) and the real samples P_data(x) using the Wasserstein distance; the Wasserstein distance W(P_data(x), P_g(x)) is

W(P_data(x), P_g(x)) = inf_{γ∈Π(P_data(x), P_g(x))} E_{(x, G(z))~γ}[||x - G(z)||],

where Π(P_data(x), P_g(x)) denotes the set of all possible joint distributions combining P_g(x) and P_data(x), a single sampling (x, G(z)) ~ γ yields a real sample x and a generated sample G(z), ||x - G(z)|| is the distance between the real sample x and the generated sample G(z), and E_{(x,G(z))~γ}[||x - G(z)||] is the expected value of this distance;

step s6, inputting the marked image slice samples into the generator G, calculating the loss of the generator from log(1 - D(G(z))) and the loss of the discriminator from -((1 - t)·log(1 - D(G(z))) + t·log D(x));

step s7, performing BP back propagation on the loss obtained from the forward propagation of the generator G in step s6, and alternately training the generator and the discriminator to optimize the network parameters;

and step s8, repeating steps s6 to s7 to train the generator and the discriminator and optimize their network parameters until the generator G and the discriminator D reach Nash equilibrium or their losses no longer change, and outputting the generator G at that moment as the trained generator G.
Optionally, in the above extraction method, in step s8, the discriminator D reaches Nash equilibrium when D(G(z)) ≈ 0.5.
Optionally, in the above extraction method, in the second step, the generator G, the discriminator D, and the generative adversarial network V(D, G) are constructed several times with different parameters and structures, and the training of steps s2 to s8 is performed to obtain different generators G; a group of superior generators is then selected from these, different weights are set according to their parameters and structures, the generators are fused and recombined according to these weights, the final generator G' formed after fusion is taken as the trained generator G, and the forward propagation operation is carried out with this final generator G' to obtain the road features.
Optionally, in the above extraction method, the discriminator D has the structural form shown in the accompanying drawing (a convolutional network; its layer-by-layer structure is given in the original figure).
Optionally, in the above extraction method, in the step s1, the marking of the road element in the image slice sample z specifically includes: the serial number of the road element, the width of the road element, the material of the road element and the environment to which the road element belongs.
Optionally, in the above extraction method, in step s1, the proportions of positive samples and negative samples in the image slice samples z are close.
Optionally, in the above extraction method, neither the image slice sample z nor the remote sensing image slice is subjected to color homogenization.
Optionally, in the above extraction method, the image slice sample z or the remote sensing image slice is a binary image with the same size.
Advantageous effects
The invention obtains a generator G capable of accurately identifying road features by optimizing the structures of the generator G, the discriminator D, and the generative adversarial network V(D, G) and by training their parameters. During training, positive and negative samples of similar proportion, without color homogenization, are selected, and the samples are marked according to the characteristics of roads. The road features so identified are closer to reality and more accurate.
Further, in the calculation of the loss function, when the commonly used KL divergence or JS divergence is employed to evaluate the difference between the generated samples P_g(x) and the real samples P_data(x), the gradient of the generator's loss function becomes fixed at a constant whenever the two distributions P_g(x) and P_data(x) have only a small overlapping area, which impedes the subsequent training process. The present invention therefore uses the Wasserstein distance instead of these two divergences to evaluate the difference between the generated samples P_g(x) and the real samples P_data(x). Since the Wasserstein distance still smoothly reflects the difference between two distributions even when they do not overlap, the problem of non-convergence of model parameters does not occur during training.
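The gradient problem described above can be illustrated numerically. The following sketch (plain Python, with hypothetical point-mass distributions, not taken from the patent) shows that the JS divergence between two non-overlapping distributions is stuck at log 2 no matter how far apart they are, while the Wasserstein distance still reflects their separation:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions
    given as {outcome: probability} dicts."""
    support = set(p) | set(q)
    m = {s: 0.5 * (p.get(s, 0.0) + q.get(s, 0.0)) for s in support}
    def kl(a, b):
        # KL(a || b), skipping zero-probability outcomes of a
        return sum(a[s] * math.log(a[s] / b[s]) for s in support if a.get(s, 0.0) > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Real data is a point mass at 0; generated data is a point mass at theta.
for theta in (0.5, 1.0, 2.0):
    p_data = {0.0: 1.0}
    p_g = {theta: 1.0}
    js = js_divergence(p_data, p_g)
    w = abs(theta - 0.0)  # Wasserstein distance between the two point masses
    print(f"theta={theta}: JS={js:.4f}, W={w}")
# JS stays at log 2 ~ 0.6931 regardless of theta (zero gradient in theta),
# while W grows with theta and still provides a training signal.
```

This is exactly why the gradient of a JS-based loss vanishes for barely overlapping distributions, while a Wasserstein-based loss does not.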
Furthermore, the invention constructs the generator G, the discriminator D, and the generative adversarial network V(D, G) with different parameters and structures, and fuses the different generators obtained. The fusion process can fully absorb the advantages of the different parameters and model structures: the different generators filter out random noise while keeping the necessary roads, which ensures the clarity of the generated result and improves the generation accuracy as much as possible.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic flow chart of a road feature extraction method according to the present invention;
FIG. 2 is a diagram of a conditionally generating countermeasure network model architecture according to the present invention;
FIG. 3 is a WGAN model fusion framework structure according to the invention;
FIG. 4 is a road feature extracted by WGAN model fusion according to the invention;
FIG. 5 is a diagram of a data source tile obtained in one particular application scenario;
FIG. 6 illustrates road features marked in the data source tile;
FIG. 7 is a road characteristic data marked by the data source tile;
FIG. 8 is a sample data set obtained from the tagging in the data source tile;
fig. 9 shows road features extracted by the WGAN model described above.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Fig. 1 is a method for extracting road features in a high resolution image according to the present invention, which includes the steps of:
firstly, obtaining a remote sensing image, and slicing the remote sensing image according to a fixed size to obtain a remote sensing image slice x;
and secondly, inputting the remote sensing image slices into a trained generator G for forward propagation operation to obtain road characteristics therein, and outputting the road characteristics.
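The slicing of the remote sensing image into fixed-size slices (the first step above) can be sketched as follows. This is a minimal illustration assuming numpy arrays and zero padding at the image borders; the function name and padding choice are illustrative, not specified in the patent:

```python
import numpy as np

def slice_image(image, tile_size=512):
    """Cut a remote sensing image into fixed-size square slices.

    Edge regions smaller than tile_size are zero-padded so that every
    slice has the uniform shape expected by the generator.
    """
    h, w = image.shape[:2]
    # pad height and width up to the next multiple of tile_size
    pad_h = (-h) % tile_size
    pad_w = (-w) % tile_size
    padding = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, padding)
    slices = []
    for top in range(0, h + pad_h, tile_size):
        for left in range(0, w + pad_w, tile_size):
            slices.append(padded[top:top + tile_size, left:left + tile_size])
    return slices

img = np.ones((1000, 1536))        # hypothetical single-band image
tiles = slice_image(img, tile_size=512)
print(len(tiles), tiles[0].shape)  # 6 slices of 512 x 512
```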
The extraction method can extract road elements from an original high-resolution remote sensing image. It is obtained by optimizing and improving the relevant links on the basis of a WGAN model, and mainly comprises the following innovations:
1: the WGAN structure optimization and sample marking process
2: the WGAN sample adversarial-training flow
3: the WGAN model fusion scheme.
The above innovation points are analyzed one by one.
1: The invention mainly improves the WGAN structure and the sample marking process, as follows. The deep learning model applied by the method takes the standard WGAN structure as its prototype, and optimizes and adjusts the network structure and training process according to the road extraction task and the characteristics of high-resolution imagery. The structure is shown in fig. 2. It contains the basic features of a standard GAN structure, namely a generator G and a discriminator D. In order to better learn the underlying principles of road features and of data generation at each level of the high-resolution images, the generator G in the model used in this patent is composed of a more complex residual network; a large number of remote sensing image slice samples z, each 512 × 512 pixels, are fed into the generator, which outputs road element distribution samples G(z) generated from the original image features. Correspondingly, the real road element distribution sample obtained by manual labeling of the sample z is recorded as x. Both types of distribution samples are represented by binary images of the same size as the original image slice, in which road elements are positive samples represented as 1, and non-road elements are negative samples represented as 0. The discriminator D is composed of a basic convolutional network and judges whether the data comes from the generated sample G(z) or the real sample x, finally outputting a discrimination result, i.e., the probability that the sample data fed to the discriminator is real data. This yields a complete WGAN model structure. The application of the model comprises two parts, training and extraction, of which the training part comprises three links: sample set production, sample training, and model fusion.
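The patent does not disclose the exact residual network used for the generator G, but the basic unit of any residual network is an identity path plus a learned correction. A minimal numpy sketch of one such unit, with hypothetical weight matrices and sizes:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """One residual unit: the block learns a correction w2·relu(w1·x)
    that is added back onto the identity path, y = x + f(x)."""
    return x + w2 @ relu(w1 @ x)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
w1 = rng.standard_normal((64, 64)) * 0.01  # small random weights
w2 = rng.standard_normal((64, 64)) * 0.01
y = residual_block(x, w1, w2)
# with small weights the block stays close to the identity mapping,
# which is what makes deep residual stacks easy to train
print(np.allclose(y, x, atol=0.1))
```

Stacking many such units (with convolutions in place of the dense matrices here) gives a residual network of the kind the patent describes.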
2: Like other deep learning models, the WGAN model can be applied in actual scenarios only after training on a large number of samples, and producing a sample set meeting the model's structural requirements is a necessary condition for sample training. Therefore, in the present invention, for the second step, the generator G may be trained according to the following steps, carrying out the WGAN adversarial process in full, to obtain a generator G with a more ideal feature extraction effect:
step s1, slicing the sample image according to a fixed size to obtain image slice samples z, and marking road elements in each image slice sample z;
step s2, constructing a generator G and a discriminator D, and initializing the generative adversarial network V(D, G); wherein the generator G is a residual network, the discriminator D is a convolutional network, the loss function of the generator G is

log(1 - D(G(z)))

and the loss function of the discriminator D is

-((1 - t)·log(1 - D(G(z))) + t·log D(x)),

where t = 1 indicates that the input is a remote sensing image slice, t = 0 indicates that the input is a sample image slice, and D̂(·) denotes the rounded output of the convolutional network;
step s3, setting the optimization goal

min_G max_D V(D, G),

where min_G and max_D give the optimization function and its optimization direction, P_data(x) denotes the distribution of all remote sensing image slices x, i.e., the real samples, P_z(z) denotes the prior distribution of the image slice samples z, and E denotes the expectation over the overall data distribution in the training process; then

V(D, G) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(z)}[log(1 - D(G(z)))]
        = E_{x~P_data(x)}[log D(x)] + E_{x~P_g(x)}[log(1 - D(x))],

where P_g(x) is the distribution of the generated samples obtained by the generator G;

step s4, evaluating the difference between the generated samples P_g(x) and the real samples P_data(x) using the Wasserstein distance; the Wasserstein distance W(P_data(x), P_g(x)) is

W(P_data(x), P_g(x)) = inf_{γ∈Π(P_data(x), P_g(x))} E_{(x, G(z))~γ}[||x - G(z)||],

where Π(P_data(x), P_g(x)) denotes the set of all possible joint distributions combining P_g(x) and P_data(x), a single sampling (x, G(z)) ~ γ yields a real sample x and a generated sample G(z), ||x - G(z)|| is the distance between the real sample x and the generated sample G(z), and E_{(x,G(z))~γ}[||x - G(z)||] is the expected value of this distance;
step s6, inputting the marked image slice samples into the generator G, calculating the loss of the generator from

log(1 - D(G(z)))

and the loss of the discriminator from

-((1 - t)·log(1 - D(G(z))) + t·log D(x));
step s7, performing BP back propagation on the loss obtained from the forward propagation of the generator G in step s6, and alternately training the generator and the discriminator to optimize the network parameters;

and step s8, repeating steps s6 to s7 to train the generator and the discriminator and optimize the network parameters until the generator G and the discriminator D reach Nash equilibrium or their losses no longer change, and outputting the generator G at that moment as the trained generator G.
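For one-dimensional samples the Wasserstein distance of step s4 has a closed form: the infimum over couplings γ is attained by pairing sorted samples. A small sketch under that assumption (the sample values are illustrative, not from the patent):

```python
import numpy as np

def wasserstein_1d(real, generated):
    """Empirical 1-Wasserstein distance between two equally sized 1-D
    sample sets: the optimal coupling pairs the sorted samples."""
    real = np.sort(np.asarray(real, dtype=float))
    generated = np.sort(np.asarray(generated, dtype=float))
    return float(np.mean(np.abs(real - generated)))

# Two generated distributions, neither overlapping the real one:
# the distance still reports *how far* each is from the real data.
p_data = [0.0, 0.1, 0.2, 0.3]
p_g_far = [5.0, 5.1, 5.2, 5.3]
p_g_near = [1.0, 1.1, 1.2, 1.3]
print(wasserstein_1d(p_data, p_g_far))   # approximately 5.0
print(wasserstein_1d(p_data, p_g_near))  # approximately 1.0
```

A divergence-based criterion would rate both generated sets as equally (maximally) wrong; the Wasserstein distance ranks the nearer one as better, which is the training signal the invention relies on.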
3: On this basis, the application effect of a deep learning model is inseparable from the degree of sample training. For the road extraction target, the richer the landform and road-grade types contained in the prepared training sample set, the better the model's transfer performance; and the larger the number of training samples of a given type, the better the model's road extraction effect. However, owing to the complexity of remote sensing images, false recognitions and missed detections remain unavoidable, and even with the same training samples, the optimized results differ considerably for different training parameters and model structures. Different training parameters and model structures each have their advantages and disadvantages, and for this reason the invention specifically provides a model fusion scheme based on WGAN:
referring to fig. 3, the present invention obtains a plurality of training results by performing multiple training on the same sample with models of different parameters and structures, selects a result with superiority from the training results, performs fusion and recombination according to the original parameters and the structure setting weights, and finally obtains a fused model. By means of a reasonable model fusion process, the advantages of different parameters and model structures can be fully absorbed, random noise is filtered while necessary roads are kept, the image clarity of a generated result is guaranteed, the generation precision is improved as much as possible, a generator model after fusion is a final model applied to a specific road element extraction task, and the effect is shown in fig. 4.
In a specific implementation of the invention:
and taking partial tiles in the nine-scene image in the high-resolution second image in the city of mansion as sample data sources to extract the road characteristics.
Refer to the data source tiles shown in FIG. 5. In order to give the model better adaptability, the remote sensing images are not subjected to color homogenization. The sample slices in each scene are labeled in ArcGIS software to form the real road element data distribution sample x for each remote sensing image slice, as shown in fig. 6. Meanwhile, in order to separately count the extraction accuracy of the generated model under different environments and road element categories, attribute labels are attached to the road elements in x; the labeled data are shown in fig. 7. The parameters are listed in Table 1 and include:
Table 1. Annotation parameter table

(Table 1 appears only as images in the original document; consistent with the labeling of step s1, it lists for each road element the serial number, width, material, and environment attributes used in the annotation.)
Then, 3000 remote sensing image slices of 512 × 512 pixels, covering various landforms and containing road elements, are selected as training data z according to the model requirements. To obtain a better training effect, the samples should be selected so that the proportions of positive and negative samples in the slice region do not differ too much; an example of the finally obtained sample data set is shown in fig. 8.
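The positive/negative balance requirement can be checked mechanically when assembling the sample set. A small sketch, where the 0.2 and 0.8 bounds are an assumed, illustrative definition of "not differing too much":

```python
import numpy as np

def road_pixel_ratio(mask):
    """Fraction of positive (road) pixels in a binary label slice."""
    return float(np.asarray(mask).mean())

def balanced(mask, low=0.2, high=0.8):
    """Keep a slice only when positive and negative pixel counts do not
    differ too much; the low/high bounds are illustrative choices."""
    return low <= road_pixel_ratio(mask) <= high

mostly_empty = np.zeros((512, 512), dtype=np.uint8)   # no road pixels
half_road = np.zeros((512, 512), dtype=np.uint8)
half_road[:, :256] = 1                                # half road pixels
print(balanced(mostly_empty), balanced(half_road))    # False True
```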
Accordingly, the invention trains on these samples and, through WGAN adversarial training, obtains a generator G that can accurately identify road features. The purpose of sample training is to let the model optimize its own structure over many learning iterations, finally yielding a generator that generates road elements from remote sensing images with good adaptability and accuracy. In the actual training process, the generator and the discriminator are each trained intensively on the prepared sample data set, so that both are continuously optimized and upgraded through mutual confrontation. The specific model training procedure is as follows:
Step 1: set the loss functions. The loss function evaluates, over the course of training, the trend of the gap between generated and real data, reflecting whether the model's optimization is moving toward the intended target. Following the standard GAN model, loss functions are first set for the discriminator and the generator respectively, as in equations 1 and 2:

$$-\left((1-t)\log\bigl(1-D(G(z))\bigr)+t\log D(x)\right) \qquad (1)$$

$$\log\bigl(1-\hat{D}(G(z))\bigr) \qquad (2)$$

where t marks the source of the input data in the current optimization pass: t = 1 when the input comes from a real sample, i.e. a remote sensing image slice, and t = 0 when it comes from a generated sample. $\hat{D}(\cdot)$ denotes the discriminator output after rounding against a threshold, generally 0.5: when the output exceeds 0.5 the model is judged to regard the input as a real sample and the value is rounded up to 1; otherwise the input is judged to be a generated sample and the value is rounded down to 0. This form is chosen because, when training the generator, both the source of the input distribution and the feedback of the output result must be taken into account.
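Equations 1 and 2 can be rendered literally in numpy. Equation 2 is taken here in the interpretation log(1 − D̂(G(z))) with a 0.5 rounding threshold, reconstructed from the garbled original, and the `eps` guard is a numerical assumption, not part of the formulas:

```python
import numpy as np

def discriminator_loss(d_real, d_fake, t):
    """Equation 1: -((1 - t) * log(1 - D(G(z))) + t * log D(x)), with
    t = 1 for a real input (remote sensing image slice) and t = 0 for
    a generated one."""
    eps = 1e-12  # numerical guard; not part of the formula itself
    return -((1 - t) * np.log(1 - d_fake + eps) + t * np.log(d_real + eps))

def generator_loss(d_fake, threshold=0.5):
    """Equation 2 as reconstructed here: log(1 - round(D(G(z)))), where
    the discriminator output is rounded up to 1 above the threshold and
    down to 0 otherwise."""
    eps = 1e-12
    d_hat = 1.0 if d_fake > threshold else 0.0  # hard real/fake decision
    return float(np.log(1.0 - d_hat + eps))
```

When the generator fools the discriminator (D(G(z)) > 0.5), the rounded output becomes 1 and the generator loss drops sharply; otherwise the loss is near zero.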
Step 2: formulate the optimization target. Adversarial training is essentially a zero-sum minimax game, and the model requires the discriminator to distinguish the source of the input data. Therefore, when the input is a real sample, the larger D(x) the better; when the input is a generated sample, the larger 1 − D(G(z)) the better; that is, the larger the discriminator's objective the better and, symmetrically, the smaller the generator's loss the better. The final training optimization target is then given by equation 3:

$$\min_G \max_D V(D,G)=\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)]+\mathbb{E}_{z\sim P_z(z)}[\log(1-D(G(z)))] \qquad (3)$$

where $\min_G \max_D$ gives the optimization function and optimization direction; $P_{data}(x)$ represents the distribution of all real samples x, i.e. all remote sensing image slices x; $P_z(z)$ represents the prior distribution of the input image, i.e. the image slice samples z; and $\mathbb{E}$ denotes the expected loss over the whole data distribution during training, with the two terms written out in equations 4 and 5:

$$\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)] \qquad (4)$$

$$\mathbb{E}_{z\sim P_z(z)}[\log(1-D(G(z)))] \qquad (5)$$
In these formulas, $P_g(x)$ denotes the distribution of generated samples. From the final purpose of training, the theoretically optimal discriminator can be derived as in equation 6:

$$D^*(x)=\frac{P_{data}(x)}{P_{data}(x)+P_g(x)} \qquad (6)$$

This expression gives the relative proportion of the probability that x comes from the generated sample distribution $P_g(x)$ versus the real sample distribution $P_{data}(x)$. Ideally the discriminator should be unable to tell the generator's samples apart, i.e. $P_g(x)$ and $P_{data}(x)$ are approximately equal, which indicates a good generation effect. This state is generally called a Nash equilibrium, at which D(G(z)) ≈ 0.5.
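Equation 6 and the Nash-equilibrium value can be checked numerically on two discretized densities; a short sketch:

```python
import numpy as np

def optimal_discriminator(p_data, p_g):
    """Equation 6: D*(x) = P_data(x) / (P_data(x) + P_g(x)),
    evaluated pointwise on two discretized densities."""
    p_data = np.asarray(p_data, dtype=float)
    p_g = np.asarray(p_g, dtype=float)
    return p_data / (p_data + p_g)

# At Nash equilibrium P_g matches P_data, so the optimal discriminator
# outputs 0.5 everywhere: it cannot tell real from generated samples.
p = np.array([0.1, 0.4, 0.3, 0.2])
d_star = optimal_discriminator(p, p)
```

Wherever the real density exceeds the generated density, D* rises above 0.5, and vice versa; only when the two distributions coincide does it flatten to 0.5.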
Step 3: modify the loss functions according to the WGAN model. Computing the losses above requires a KL- or JS-divergence formula to evaluate the gap between the generated distribution $P_g(x)$ and the real distribution $P_{data}(x)$, and part of the training difficulty of the conventional GAN model stems from defects of these two divergence formulas: when $P_g(x)$ and $P_{data}(x)$ have little or no overlap, the gradient of the generator's loss function collapses to a constant, which impedes the subsequent training process. The invention therefore replaces these two divergences by introducing the Wasserstein distance, as in equation 7:

$$W\bigl(P_{data}(x)',P_g(x)\bigr)=\inf_{\gamma\in P_{data}(x)'}\mathbb{E}_{(x,G(z))\sim\gamma}\bigl[\|x-G(z)\|\bigr] \qquad (7)$$

where $P_{data}(x)'$ denotes the set of all possible joint distributions combining $P_g(x)$ and $P_{data}(x)$. A single draw $(x,G(z))\sim\gamma$ yields a real sample x and a generated sample G(z); the distance between this pair is computed as $\|x-G(z)\|$, giving the expectation $\mathbb{E}_{(x,G(z))\sim\gamma}[\|x-G(z)\|]$, and taking the infimum (greatest lower bound) of this expectation over all $\gamma$ yields the Wasserstein distance $W(P_{data}(x)',P_g(x))$. Unlike the KL and JS divergences, the Wasserstein distance still varies smoothly and reflects the gap between two distributions even when they do not overlap.
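For equal-sized 1-D empirical samples the infimum in equation 7 has a closed form (the optimal coupling pairs sorted values), which makes the distance easy to illustrate. This is a toy stand-in only; for image distributions WGAN instead estimates the distance through its dual (critic) formulation:

```python
import numpy as np

def wasserstein_1d(real, fake):
    """Exact 1-Wasserstein distance between two equal-sized 1-D
    empirical samples: under the optimal coupling gamma the sorted
    values pair up, so the distance is the mean absolute gap between
    sorted samples."""
    real = np.sort(np.asarray(real, dtype=float))
    fake = np.sort(np.asarray(fake, dtype=float))
    return float(np.abs(real - fake).mean())
```

Note that shifting one sample set farther away increases the distance smoothly even when the two sets no longer overlap, which is exactly the property the JS divergence lacks.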
To apply the Wasserstein distance in the loss computation of the two models, a discriminator network $f_w$ is constructed that contains a parameter set w and has no nonlinear activation (sigmoid) layer as its last layer; it is substituted into the original model, and w is constrained not to exceed a certain fixed range. A sigmoid layer is the function on an artificial neuron that maps the neuron's input to its output; by removing this layer, WGAN converts the final real-versus-fake binary classification into a regression problem. The modified discriminator and generator loss functions are given in equations 8 and 9:

$$-\mathbb{E}_{x\sim P_{data}(x)}[f_w(x)]+\mathbb{E}_{z\sim P_z(z)}[f_w(G(z))] \qquad (8)$$

$$-\mathbb{E}_{z\sim P_z(z)}[f_w(G(z))] \qquad (9)$$
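Equations 8 and 9 and the weight constraint can be sketched with a toy linear critic. The linear form f_w(x) = w·x, the clipping bound 0.01, and the helper names are illustrative assumptions standing in for the convolutional discriminator:

```python
import numpy as np

CLIP = 0.01  # bound on w; keeps f_w (approximately) Lipschitz

def critic(w, x):
    """Toy linear critic f_w(x) = w * x: no sigmoid on the output, so
    it performs regression rather than 0/1 classification."""
    return w * np.asarray(x, dtype=float)

def critic_loss(w, real, fake):
    """Equation 8: -E[f_w(x)] + E[f_w(G(z))], minimized by the critic."""
    return float(-critic(w, real).mean() + critic(w, fake).mean())

def generator_wloss(w, fake):
    """Equation 9: -E[f_w(G(z))], minimized by the generator."""
    return float(-critic(w, fake).mean())

def clip_weights(w):
    """WGAN weight clipping: keep w inside [-CLIP, CLIP] after each
    critic update so it does not exceed the fixed range."""
    return float(np.clip(w, -CLIP, CLIP))
```

Minimizing equation 8 drives the critic's score up on real samples and down on generated ones; the clipping step is what makes the resulting score difference behave like a Wasserstein-distance estimate rather than diverging.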
Step 4: compute the losses by forward propagation. The labeled sample training set is input into the network, and the losses of the discriminator and the generator are computed according to equations 8 and 9 respectively.
Step 5: optimize the network by back propagation. BP back propagation is carried out on the forward-propagation losses; the generator and the discriminator are trained alternately, and the network parameters are optimized.
Step 6: repeat the training until the optimization target is reached. The training of steps 4 and 5 is repeated until the optimization result approaches the Nash equilibrium or the gradient of the model's loss function no longer changes. At that point the discriminator can no longer tell whether a data sample output by the generator is real, yielding a generator that produces high-precision road elements from remote sensing images; the test effect is shown in fig. 9. The road features extracted by the generator substantially conform to the real road conditions.
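Steps 4 to 6 can be condensed into a toy alternating training loop. Everything here is an illustrative assumption: a 1-D real distribution, a generator that only learns a shift b, a linear critic f_w(x) = w·x with weight clipping, manually derived gradients in place of BP in a deep-learning framework, and made-up hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)
CLIP, LR_C, LR_G = 0.01, 0.1, 5.0  # illustrative hyperparameters

# Toy setting: real data ~ N(3, 1); the generator G(z) = z + b with
# z ~ N(0, 1), so training should drive b toward the real mean of 3.
w, b = 0.0, 0.0
for step in range(200):
    real = rng.normal(3.0, 1.0, 64)
    z = rng.normal(0.0, 1.0, 64)
    fake = z + b
    # Critic step (eq. 8): descend on -mean(f_w(real)) + mean(f_w(fake)),
    # whose gradient in w is -mean(real) + mean(fake); then clip w.
    grad_w = -real.mean() + fake.mean()
    w = float(np.clip(w - LR_C * grad_w, -CLIP, CLIP))
    # Generator step (eq. 9): descend on -mean(f_w(fake)) = -w*(mean(z)+b),
    # whose gradient in b is -w.
    b = b + LR_G * w
# After training, b sits near 3: the critic can no longer separate real
# from generated samples, i.e. the loop is close to Nash equilibrium.
```

The same alternation (critic update with clipping, then generator update) is what the full model runs with convolutional and residual networks in place of the scalar parameters.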
The deep-learning-based approach of the invention uses a multilayer network to learn road element characteristics at different levels of the remote sensing image and folds multi-stage processing into the model's internal optimization, which is consistent in essence with the basic idea of road extraction methods. Deep learning applied to road feature extraction can be regarded as a feature extraction method that combines the object-oriented approach with the pixel-based approach. The technique reduces manual intervention and makes full use of the various elements in the high-resolution image. Given sufficient samples, the feature extraction model of the invention acquires strong transfer capability and well overcomes the shortcomings of traditional methods in image preprocessing and recognition accuracy.
Those of ordinary skill in the art will understand that, although the invention has been described in detail with reference to the foregoing embodiments, changes may still be made to the embodiments or equivalents substituted for some of their features without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its protection scope.

Claims (8)

1. A method for extracting road features in a high-resolution image is characterized by comprising the following steps:
the method comprises the steps of firstly, obtaining a remote sensing image, and slicing the remote sensing image according to a fixed size to obtain a remote sensing image slice x;
secondly, inputting the remote sensing image slices into a trained generator G for forward propagation operation to obtain road characteristics therein, and outputting the road characteristics;
in the second step, the generator G is obtained by training according to the following steps:
step s1, slicing the sample image according to a fixed size to obtain image slice samples z, and marking road elements in each image slice sample z;
step s2, constructing a generator G and a discriminator D, and initializing the generative adversarial network V(D, G); wherein the generator G is a residual network and the discriminator D is a convolutional network; the loss function of the generator G is constructed as

$$\log\bigl(1-\hat{D}(G(z))\bigr)$$

and the loss function of the discriminator D is constructed as $-\left((1-t)\log(1-D(G(z)))+t\log D(x)\right)$, wherein t = 1 indicates that a remote sensing image slice is input and t = 0 indicates that an image slice sample is input; $\hat{D}(\cdot)$ represents the rounded output of the convolutional network; G(z) represents a road element distribution sample generated from the original image features, D(G(z)) represents the result of inputting that generated sample into the discriminator, and D(x) represents the result of inputting the real road element distribution sample into the discriminator;
step s3, formulating the optimization target

$$\min_G \max_D V(D,G)=\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)]+\mathbb{E}_{z\sim P_z(z)}[\log(1-D(G(z)))]$$

wherein $\min_G \max_D$ is the optimization function, $P_{data}(x)$ represents the distribution of all remote sensing image slices x as real samples, $P_z(z)$ represents the prior distribution of the image slice samples z, and $\mathbb{E}$ represents the loss over the overall data distribution in the training process, with

$$\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)]$$

$$\mathbb{E}_{z\sim P_z(z)}[\log(1-D(G(z)))]$$

wherein $P_g(x)$ is the distribution of the generated samples obtained by the generator G;
step s4, evaluating the difference between the generated sample distribution $P_g(x)$ and the real sample distribution $P_{data}(x)$ using the Wasserstein distance; wherein the Wasserstein distance $W(P_{data}(x)',P_g(x))$ is

$$W\bigl(P_{data}(x)',P_g(x)\bigr)=\inf_{\gamma\in P_{data}(x)'}\mathbb{E}_{(x,G(z))\sim\gamma}\bigl[\|x-G(z)\|\bigr]$$

wherein $P_{data}(x)'$ represents the set of all possible joint distributions combining $P_g(x)$ and $P_{data}(x)$; a single sampling $(x,G(z))\sim\gamma$ yields a real sample x and a generated sample G(z); $\|x-G(z)\|$ is the distance between the real sample x and the generated sample G(z); and $\mathbb{E}_{(x,G(z))\sim\gamma}[\|x-G(z)\|]$ is the expected value of the distance between the real sample x and the generated sample G(z);
step s5, inputting the labeled image slice samples into the generator G, calculating the loss of the generator according to

$$-\mathbb{E}_{z\sim P_z(z)}[f_w(G(z))]$$

and calculating the loss of the discriminator according to

$$-\mathbb{E}_{x\sim P_{data}(x)}[f_w(x)]+\mathbb{E}_{z\sim P_z(z)}[f_w(G(z))]$$

wherein $f_w$ is the discriminator network containing the parameter w and having no nonlinear activation layer at its output;
step s6, performing a BP back propagation operation on the loss obtained by the forward propagation operation of the generator G in step s5, and training the generator and the discriminator alternately to optimize the network parameters;
and step s7, repeating steps s5 to s6, training the generator and the discriminator and optimizing the network parameters until the generator G and the discriminator D reach Nash equilibrium or the losses of the generator G and the discriminator D no longer change, and outputting the generator G at that point as the trained generator G.
2. The method according to claim 1, wherein in step s7 the discriminator D reaches Nash equilibrium when D(G(z)) ≈ 0.5.
3. The method for extracting road features in a high-resolution image according to claim 1, wherein in the second step, the generator G, the discriminator D and the generative adversarial network V(D, G) are constructed several times with different parameters and structures, and the training of steps s2 to s7 is performed to obtain different generators G; a group of generators with superior performance is then selected from these different generators G, different weights are set according to their parameters and structures, the selected generators are fused and recombined according to those weights, the final generator G' formed after fusion is taken as the trained generator G, and the forward propagation operation is carried out with the final generator G' to obtain the road features.
4. The method as claimed in claim 3, wherein the discriminator D is of the form

$$D^*(x)=\frac{P_{data}(x)}{P_{data}(x)+P_g(x)}$$
5. The method as claimed in claim 3, wherein the labeling of the road elements in the image slice sample z in step s1 specifically comprises: the serial number of the road element, the width of the road element, the material of the road element, and the environment to which the road element belongs.
6. The method according to claim 3, wherein in step s1, the proportion of positive and negative samples in the image slice sample z is close.
7. The method for extracting road features from high-resolution images according to claim 3, wherein neither the image slice sample z nor the remote sensing image slice is color homogenized.
8. The method for extracting road features in high-resolution images according to claim 7, wherein the image slice sample z or the remote sensing image slice is a binary image with equal size.
CN201811532429.XA 2018-12-14 2018-12-14 Method for extracting road characteristics in high-resolution image Active CN109635748B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811532429.XA CN109635748B (en) 2018-12-14 2018-12-14 Method for extracting road characteristics in high-resolution image

Publications (2)

Publication Number Publication Date
CN109635748A CN109635748A (en) 2019-04-16
CN109635748B true CN109635748B (en) 2021-09-03

Family

ID=66074043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811532429.XA Active CN109635748B (en) 2018-12-14 2018-12-14 Method for extracting road characteristics in high-resolution image

Country Status (1)

Country Link
CN (1) CN109635748B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110263612A (en) * 2019-04-25 2019-09-20 北京工业大学 Based on the multi-spectral remote sensing image method for extracting roads for generating confrontation network
CN110211046B (en) * 2019-06-03 2023-07-14 重庆邮电大学 Remote sensing image fusion method, system and terminal based on generation countermeasure network
CN110598673A (en) * 2019-09-24 2019-12-20 电子科技大学 Remote sensing image road extraction method based on residual error network
CN111192221B (en) * 2020-01-07 2024-04-16 中南大学 Aluminum electrolysis fire hole image repairing method based on deep convolution generation countermeasure network
CN112580721B (en) * 2020-12-19 2023-10-24 北京联合大学 Target key point detection method based on multi-resolution feature fusion
CN112634169B (en) * 2020-12-30 2024-02-27 成都星时代宇航科技有限公司 Remote sensing image color homogenizing method and device
CN112906459A (en) * 2021-01-11 2021-06-04 甘肃省公路局 Road network checking technology based on high-resolution remote sensing image and deep learning method
CN112733756B (en) * 2021-01-15 2023-01-20 成都大学 Remote sensing image semantic segmentation method based on W divergence countermeasure network
CN112734849B (en) * 2021-01-18 2022-06-21 上海市城市建设设计研究总院(集团)有限公司 Computer-based urban road network intersection angle detection method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563510A (en) * 2017-08-14 2018-01-09 华南理工大学 A kind of WGAN model methods based on depth convolutional neural networks
CN107909621A (en) * 2017-11-16 2018-04-13 深圳市唯特视科技有限公司 It is a kind of based on it is twin into confrontation network medical image synthetic method
CN108763857A (en) * 2018-05-29 2018-11-06 浙江工业大学 A kind of process soft-measuring modeling method generating confrontation network based on similarity
CN108805188A (en) * 2018-05-29 2018-11-13 徐州工程学院 A kind of feature based recalibration generates the image classification method of confrontation network
CN108830209A (en) * 2018-06-08 2018-11-16 西安电子科技大学 Based on the remote sensing images method for extracting roads for generating confrontation network
CN108922518A (en) * 2018-07-18 2018-11-30 苏州思必驰信息科技有限公司 voice data amplification method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8704653B2 (en) * 2009-04-02 2014-04-22 GM Global Technology Operations LLC Enhanced road vision on full windshield head-up display
EP2889327B1 (en) * 2013-11-27 2016-09-21 The Goodyear Tire & Rubber Company Rubber composition for use in a tread of a pneumatic tire
US10474929B2 (en) * 2017-04-25 2019-11-12 Nec Corporation Cyclic generative adversarial network for unsupervised cross-domain image generation
CN108596141B (en) * 2018-05-08 2022-05-17 深圳大学 Detection method and system for generating face image by deep network
CN108764173B (en) * 2018-05-31 2021-09-03 西安电子科技大学 Hyperspectral image classification method based on multi-class generation countermeasure network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Wasserstein GAN; Martin Arjovsky et al.; arXiv, 2017-01-31; pp. 1-30 *
Research on Image Semantic Segmentation Based on Deep Learning; Xiao Xu; China Masters' Theses Full-text Database, Information Science and Technology, 2018-01-15, no. 1; pp. I138-1011 *
Research on Road Extraction Methods for High-Resolution Remote Sensing Imagery; Wang Shuang; China Masters' Theses Full-text Database, Engineering Science and Technology II, 2014-07-15, no. 7; pp. C034-358 *

Also Published As

Publication number Publication date
CN109635748A (en) 2019-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant