CN110807376A - Method and device for extracting urban road based on remote sensing image - Google Patents


Info

Publication number
CN110807376A
Authority
CN
China
Prior art keywords: road, image, remote sensing, road extraction, sensing image
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Application number
CN201910988983.7A
Other languages
Chinese (zh)
Inventor
徐其志
郭梦瑶
张帆
Current Assignee
Beijing University of Chemical Technology
Original Assignee
Beijing University of Chemical Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Chemical Technology filed Critical Beijing University of Chemical Technology
Priority to CN201910988983.7A priority Critical patent/CN110807376A/en
Publication of CN110807376A publication Critical patent/CN110807376A/en
Pending legal-status Critical Current


Classifications

    • G06V20/182 — Image or video recognition or understanding; scenes; terrestrial scenes; network patterns, e.g. roads or rivers
    • G06F18/214 — Pattern recognition; design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045 — Computing arrangements based on biological models; neural networks; architecture; combinations of networks
    • G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods

Abstract

The application provides a road extraction method and device based on remote sensing images. The method comprises: obtaining GIS image information from a digital map and generating label data and training/testing data; constructing an initial road extraction network model based on the U-Net network; training the initial network model with the label data and the training/testing data to obtain a road extraction model with road recognition capability; and detecting a remote sensing image with the road extraction model to automatically extract road targets. By constructing an improved U-Net network, the method and device improve the accuracy of extracting urban and extra-urban roads from remote sensing images.

Description

Method and device for extracting urban road based on remote sensing image
Technical Field
The application relates to the technical field of remote sensing information processing, in particular to remote sensing image processing technology, and especially to an extra-urban road information extraction method and device based on remote sensing images.
Background
In recent years, remote sensing technology in China has developed rapidly and is widely applied in urban and rural planning, resource exploitation, military affairs, water conservancy, pollution control, emergency disaster relief and other fields. For example, in wartime some highway sections can serve as makeshift highway airstrips, so identifying such sections is significant for air-force operations. For another example, in emergency disaster relief, especially in rural areas, finding a suitable road can improve search-and-rescue efficiency and reduce wasted resources.
Although navigation and digital map technology in China is now fairly mature, map updates usually lag behind the development of towns and villages, so accurately extracting road information is very important.
Highways are divided into five grades according to their tasks, functions and traffic volume, and into national, provincial, county roads and so on according to administrative level. In addition to these formally graded roads, there are many roads that are not strictly classified at the national level, such as rural lanes, forest tracks and naturally formed paths in rural areas; to distinguish them from urban roads, all of these are collectively called extra-urban roads.
Compared with urban roads, extra-urban roads span more grades and scales, so road extraction suffers from missed detections. Extraction results may be incomplete where mountain roads pass through tunnels, and detection results may be discontinuous where forest roads are occluded by trees or shadows. Because some rural "dirt roads" are long and narrow, their features are severely lost as the network depth increases; in more remote areas the proportion of road pixels is smaller, which can drive the network into a local optimum and degrade the detection result.
Many road extraction algorithms have been proposed to date, most of which segment the image based on texture, boundary, pixel and other information. These methods, however, are sensitive to image resolution and scale. With the recent rise of artificial intelligence, the U-Net model has been applied to image segmentation, and combining U-Net with ResNet has deepened the original U-Net and yielded better extraction results. In general, though, these techniques still cannot achieve fully satisfactory results when extracting urban and extra-urban roads.
Disclosure of Invention
In view of this, the application provides an urban road extraction method and device based on remote sensing images, so as to improve the road extraction accuracy.
In order to achieve the above purpose, the following technical solutions are adopted in the present application:
according to a first aspect of the present application, a method for constructing a road extraction model includes:
acquiring GIS image information from the digital map, wherein the GIS image information comprises a remote sensing image reflecting the real ground feature condition and a road pure map marking the ground feature information;
cutting the remote sensing image and the road pure map of the same geographical position into small square images, screening them, and deleting image pairs in which the road pure map and the remote sensing image do not correspond correctly;
acquiring marking data based on the screened road pure map; acquiring road experience knowledge as training/testing data based on the screened remote sensing image;
constructing an initial road extraction network model based on a U-Net network;
and training the initial road extraction model by using the marking data and the training/testing data, and storing model parameters to obtain the road extraction model with the road recognition capability.
In another embodiment, the remote sensing image is a Google image, and the road pure map is a Baidu road pure map.
In another embodiment, the method for acquiring the road experience knowledge based on the screened remote sensing images comprises the following steps:
extracting LBP features of the original image to obtain a first feature map;
filtering the original image, extracting image features with the Sobel operator, and then applying a morphological closing to obtain a second feature map; and
stacking the first feature map, the second feature map and the original image to obtain the training/testing data.
In another embodiment, the initial road extraction network model comprises a first convolution module, a coding part, a decoding part and a classifier which are connected in sequence, wherein the coding part comprises four stages of residual error network modules, and the decoding part comprises three stages of convolution modules.
In another embodiment, the first convolution module, the encoding part and the decoding part include a bridge structure, a slant downward connection structure and a slant upward connection structure therebetween, and the three-stage convolution module of the decoding part receives images of different scales from the first convolution module and the encoding part through the bridge structure, the slant downward connection structure and the slant upward connection structure, respectively.
In another embodiment, the feature maps obtained by the three-level convolution module are respectively input into the pyramid pooling modules, and the feature maps output by the pyramid pooling modules are up-sampled to the size of the original image, and are output to the classifier after being overlapped with the original image.
In another embodiment, the loss value is calculated using a Focal Loss function when performing model training.
According to a second aspect of the application, a road extraction method based on remote sensing images comprises the following steps:
acquiring a remote sensing image to be detected;
the remote sensing image is detected by using the road extraction model obtained by the road extraction model construction method according to any one of the first aspect, and the road target is automatically extracted.
According to a third aspect of the present application, a road extraction device based on a remote sensing image comprises:
the image acquisition equipment is used for acquiring a remote sensing image to be detected;
a memory having computer instructions stored therein;
and the processor is in data connection with the image acquisition equipment and the memory, and executes the computer instructions to execute the road extraction method according to the second aspect so as to automatically extract the road target in the remote sensing image.
According to a fourth aspect of the present application, a computer-readable storage medium stores computer instructions which, when executed by a processor, implement the road extraction model construction method according to the first aspect or the road extraction method according to the second aspect.
Due to the adoption of the scheme, the method has the following technical effects:
(1) the method combines knowledge driving with the U-net model for extracting urban and extra-urban roads, integrates the ideas of the U-net and ResNet models, deepens the network and improves the extraction precision of urban and extra-urban roads;
(2) the method and device make full use of multi-scale context information, effectively avoid the loss of detail information, and can extract extra-urban roads of different widths;
(3) the method and device make full use of texture, pixel-value and other information; the algorithm is simple and easy to apply in engineering.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a schematic flow chart of a road extraction model construction method according to a first embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of data annotation and extraction of empirical knowledge according to an embodiment of the present application;
FIG. 3 is an improved U-Net network model structure according to an embodiment of the present application;
FIG. 4 illustrates a structure of a residual module and a convolution module according to an embodiment of the present application;
FIG. 5 is a pooling pyramid pooling module structure according to an embodiment of the present application;
FIG. 6 is a diagram illustrating an example of extraurban road extraction using the method provided by the embodiment of the present application; wherein, (a) is a remote sensing image, (b) is a road pure image, and (c) is an extracted road image;
FIG. 7 is a schematic flow chart of a method for remote sensing image-based road extraction according to a second embodiment of the present application;
fig. 8 is a schematic structural diagram of a remote sensing image-based road extraction device according to a third embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to fig. 1, fig. 1 shows a schematic flow of a road extraction model construction method according to a first embodiment of the present application, which is particularly intended for extraction of extra-urban roads, such as those in suburbs, rural areas and mountainous areas. As shown in the figure, the road extraction model construction method comprises the following steps:
step 101, acquiring GIS image information from a digital map.
According to an alternative embodiment of the present application, GIS image information may be downloaded from the LocaSpace Viewer digital-map software, for example, including remote sensing images reflecting real ground features, such as Google imagery, and road pure maps labelling the ground-feature information, such as Baidu road pure maps. For rural areas, a data resolution of 1.022 m/pixel may be selected.
By acquiring GIS image information from a digital map as model training data, a large number of high-quality labelled samples can be generated rapidly, overcoming the time and labour cost of manual labelling and the scarcity of high-quality labelled samples, and providing reliable samples for model training.
And 102, cutting the remote sensing image and the road pure image in the same geographical position into small square images, and screening.
According to an alternative embodiment of the present application, Matlab software may be used to cut the downloaded Google image (as shown in fig. 6(a)) and road pure map (as shown in fig. 6(b)) of the same geographic location into n × n small square tiles. The cut tiles are screened, and mismatched pairs of road pure map and Google image are deleted, ensuring the accuracy of the samples.
103, acquiring the labeling data based on the screened road pure map.
Referring to fig. 2, for the road pure map an appropriate threshold is set within the threshold range; pixels satisfying the threshold condition, together with the pixels in their 4-neighbourhoods, are marked as road, the strip-shaped features are extracted, and a binarized road label map (styled as in fig. 6(c)) is produced, rapidly generating high-quality labelled samples.
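The thresholding-plus-4-neighbourhood labelling described above can be sketched in a few lines of numpy. The function name, the intensity range and the single-channel input in this sketch are illustrative assumptions, not the patent's exact procedure:

```python
import numpy as np

def label_roads(pure_map, lo, hi):
    """Binarize a road pure map: pixels whose value falls in [lo, hi]
    and the pixels in their 4-neighbourhoods are marked as road (1)."""
    road = ((pure_map >= lo) & (pure_map <= hi)).astype(np.uint8)
    grown = road.copy()
    grown[1:, :] |= road[:-1, :]   # neighbour above
    grown[:-1, :] |= road[1:, :]   # neighbour below
    grown[:, 1:] |= road[:, :-1]   # neighbour to the left
    grown[:, :-1] |= road[:, 1:]   # neighbour to the right
    return grown
```

A single road pixel thus becomes a 5-pixel cross in the label map, mirroring the "pixel plus 4-neighbourhood" rule in the text.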
And 104, acquiring road experience knowledge as training/testing data based on the screened remote sensing images, wherein the road experience knowledge comprises texture information, pixel value information, topological information and the like of the images.
Continuing to refer to fig. 2, based on the filtered google images, the following steps are performed to obtain the road experience knowledge:
1041, extracting LBP features of the original image to obtain a first feature map;
the LBP operator is a local texture extraction method, a window central pixel is used as a threshold value, if a peripheral pixel is larger than the central pixel, the central pixel is marked as 1, and if not, the central pixel is marked as 0. And comparing the adjacent 8 pixels, generating 8-bit binary codes according to the clockwise sequence, and converting the binary codes into decimal codes to obtain the LBP value of the central pixel point so as to reflect texture information.
The first feature map generated by the LBP transform reflects the texture of the band-shaped road targets well and provides a basis for further mining road features in the network.
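A minimal numpy sketch of the 8-neighbour LBP computation described above; border pixels are left at 0 for simplicity, and the clockwise bit order starting from the top-left neighbour is an assumption, since the text does not fix the starting neighbour:

```python
import numpy as np

def lbp_map(img):
    """8-neighbour LBP: threshold the neighbours against the centre pixel,
    then read the resulting bits clockwise into one decimal code per pixel."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    # clockwise offsets starting at the top-left neighbour
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y, x]
            code = 0
            for dy, dx in offsets:
                code = (code << 1) | int(img[y + dy, x + dx] > c)
            out[y, x] = code
    return out
```

A flat (textureless) region yields code 0, while edges and texture produce characteristic bit patterns, which is what makes the map useful as a texture channel.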
1042, filtering the original image, extracting image features with the Sobel operator, and then obtaining a second feature map through a closing operation;
according to an optional embodiment of the present application, the filtering algorithm is mean shift filtering, and the method performs smooth filtering on an image on a color level, but retains prominent texture information.
The Sobel operator is an image edge detection operator. It comprises two 3 × 3 kernels, one horizontal and one vertical, each of which is convolved with the image to obtain approximate brightness differences in the horizontal and vertical directions. The convolution formulas are:

Gx = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A

Gy = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A

where A is the source image, and Gx and Gy are the horizontal and vertical edge detection images, respectively.
The gradient value of each pixel of the image may be calculated using the following formula:

|G| = √(Gx² + Gy²)

The above formula can also be simplified as:

|G| = |Gx| + |Gy|
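The Sobel kernels and the simplified magnitude |G| = |Gx| + |Gy| can be exercised with a direct, unoptimized numpy implementation. This is a sketch, not the patent's code; it uses cross-correlation, which the absolute values make equivalent to true convolution here:

```python
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal edges
KY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])   # vertical edges

def sobel_magnitude(img):
    """Slide the two 3x3 Sobel kernels over the image and combine the
    responses with the simplified magnitude |G| = |Gx| + |Gy|."""
    h, w = img.shape
    g = np.zeros((h, w), dtype=float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            gx = np.sum(KX * patch)
            gy = np.sum(KY * patch)
            g[y, x] = abs(gx) + abs(gy)
    return g
```

On a vertical step edge the horizontal kernel responds strongly while flat regions give zero, which is why the result highlights road boundaries.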
the closed operation is a morphological operation and comprises two steps, namely, firstly performing expansion operation and then performing corrosion operation. And the method also comprises an opening operation which comprises the steps of firstly performing corrosion operation and then performing expansion operation.
The dilation operation merges the background points touching the target region into the target and expands the target outwards. The calculation formula is:

A ⊕ B = {(x, y) | (B)xy ∩ A ≠ ∅}

This formula represents dilating A with structuring element B: the origin of B is translated to image pixel position (x, y). If the intersection of B and A at (x, y) is not empty (that is, at least one image value of A is 1 at a position where the corresponding element of B is 1), the pixel (x, y) of the output image is assigned 1; otherwise it is assigned 0.
The erosion operation shrinks the target boundary and can eliminate meaningless targets. The formula is:

A ⊖ B = {(x, y) | (B)xy ⊆ A}

This formula represents eroding A with structuring element B. The pixel (x, y) of the output image is assigned 1 if B translated to (x, y) is completely contained in the region where it overlaps image A (that is, the image values of A are all 1 at the positions where the corresponding elements of B are 1), and 0 otherwise.
The second characteristic diagram obtained through filtering and sobel operator operation weakens the tree texture and other non-obvious textures, highlights the edge information of the road, serves as auxiliary information, is beneficial to reducing the probability that some secondary texture information is mistaken as the road, and improves the segmentation precision.
1043, stacking the first feature map, the second feature map and the original image to obtain training/testing data.
The first feature map, the second feature map and the three RGB channels of the original image are stacked to obtain the road experience knowledge used as training/testing data.
By stacking the original image with the first and second feature maps, the colour, texture and edge information are fused, further improving the performance of the model.
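The channel stacking that produces each training sample is a one-liner in numpy; the sketch assumes an H × W × 3 RGB tile and two single-channel H × W feature maps, giving a 5-channel sample:

```python
import numpy as np

def build_sample(rgb, lbp_feat, edge_feat):
    """Stack an RGB tile (H, W, 3) with the LBP and edge feature maps
    (each H, W) into one 5-channel training sample (H, W, 5)."""
    return np.dstack([rgb, lbp_feat, edge_feat])
```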
According to an optional embodiment of the present application, data expansion is also performed on the label data and the training/testing data: for example, each image and its label map are rotated clockwise by 90°, 180° and 270°, and flipped once horizontally and once vertically. Data expansion further enlarges the data volume and effectively helps prevent overfitting.
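The five extra copies per sample (three clockwise rotations plus two flips), applied identically to image and label, can be sketched as:

```python
import numpy as np

def augment(img, mask):
    """Return the 5 extra (image, label) copies described in the text:
    clockwise rotations of 90/180/270 degrees plus a horizontal and a
    vertical flip, with image and label transformed identically."""
    pairs = []
    for k in (1, 2, 3):                               # 90, 180, 270 clockwise
        pairs.append((np.rot90(img, -k), np.rot90(mask, -k)))
    pairs.append((img[:, ::-1], mask[:, ::-1]))       # horizontal flip
    pairs.append((img[::-1, :], mask[::-1, :]))       # vertical flip
    return pairs
```

Together with the original, each sample thus contributes six variants to the training set.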
It can be understood by those skilled in the art that steps 103 and 104 are not limited to the order described above: they may be executed sequentially, with either step first, or synchronously.
Step 105, constructing a road extraction model based on the U-net network.
A conventional U-net network comprises two parts: a feature extraction part and an up-sampling part. The feature extraction part produces one additional scale after each pooling layer, for 5 scales in total including the original image scale. In the up-sampling part, each up-sampled feature map is concatenated with the feature map of the same scale from the corresponding channel of the feature extraction part.
As shown in fig. 3, the road extraction model in the present application employs a modified U-net network that retains the encoding portion C, decoding portion D, intermediate bridge structure L0 between the encoding and decoding portions, and the classifier of the conventional U-net model. The image inputted to the network is first passed through a convolution module X1, then inputted to the encoding section C for image feature extraction, and then passed through the decoding section D to restore the image size step by step. The bridge structure L0 fuses the encoded part and decoded part information of the same scale to reduce information loss.
As shown in fig. 4, the encoding and decoding portions of the present application are each composed of BN (batch normalization), ReLU (activation function), Conv (convolution operation).
In the improved U-Net network, the coding part C replaces the feature extraction part of the traditional U-Net with a residual network, and the residual network modules form the main body of the improved U-Net's down-sampling path. According to an embodiment herein, the coding part C comprises a four-level residual network X2-X5, which may be, for example, ResNet residual modules. The residual network reduces information loss during encoding and allows the network depth to be increased.
As shown in fig. 4(a), each residual network module used in the present application consists of two branches: the first, main branch is composed of BN, ReLU and Conv and performs two convolutions; the second is a residual branch, and the two branches are summed at the output. Suppose the input is x_l and the output is x_(l+1); with F denoting the function of the first branch, the expression is x_(l+1) = F(x_l) + x_l. This skip connection, which originates from ResNet, slows the decay of information and improves the learning rate.
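The identity x_(l+1) = F(x_l) + x_l is easy to verify on a toy fully connected version of the block. Batch normalization and convolutions are replaced by plain linear maps here; this is a hedged sketch, not the patent's network:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy pre-activation residual unit on a flat feature vector: the main
    branch applies two (ReLU -> linear) stages F(x), the identity branch
    adds the input back: x_(l+1) = F(x_l) + x_l. BN is omitted for brevity."""
    f = w2 @ relu(w1 @ relu(x))
    return f + x
```

Even when the main branch contributes nothing (all-zero weights), the input passes through unchanged, which is the property that prevents features of small roads from vanishing in deep networks.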
The decoding part D includes three stages of convolution modules Y1-Y3. Fig. 4(b) shows the structure of each upsampling convolution module of the present application.
Further, the improved U-net network adds oblique connection structures to the traditional U-net network, including an oblique downward connection structure L1 and an oblique upward connection structure L2, so that the network can fully utilize the context information of images at different scales.
Referring to fig. 3, the convolution module Y3 is connected to the convolution module X1 through a lower connection structure L1 and to the residual network X3 through an upper connection structure L2, in addition to the residual network X2 through a bridge structure L0. Similarly, the convolution module Y2 is connected to the convolution module X2 via the lower connection structure L1 and to the residual network X4 via the upper connection structure L2, in addition to being connected to the residual network X3 via the bridge structure L0, and the convolution module Y1 is connected to the residual network X4 via the bridge structure L0, to the convolution module X3 via the lower connection structure L1 and to the residual network X5 via the upper connection structure L2.
The oblique upward connection structure comprises image up-sampling and image fusion: images smaller than the corresponding scale are up-sampled to the same size and then fused.
The upsampling may be implemented, for example, by:
I_k(x+u, y+v) = (1-u)(1-v)·I_k(x, y) + uv·I_k(x+1, y+1) + v(1-u)·I_k(x, y+1) + u(1-v)·I_k(x+1, y)

where the interpolated point I_k(x+u, y+v) lies between the k-th-channel pixels I_k(x, y), I_k(x+1, y), I_k(x, y+1) and I_k(x+1, y+1), with 0 < u < 1 and 0 < v < 1; x is the row index and y the column index of the image.
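This bilinear interpolation formula maps directly to code; the sketch follows the text's convention that x indexes rows and y indexes columns:

```python
import numpy as np

def bilinear_sample(channel, x, y, u, v):
    """Interpolate one channel at the fractional position (x+u, y+v)
    from the four surrounding pixels, term for term as in the formula."""
    return ((1 - u) * (1 - v) * channel[x, y]
            + u * v * channel[x + 1, y + 1]
            + v * (1 - u) * channel[x, y + 1]
            + u * (1 - v) * channel[x + 1, y])
```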
The oblique downward connection structure comprises image down-sampling and image fusion: images larger than the corresponding scale are down-sampled to the same size and then fused. For example, an image of size M × N down-sampled by a factor of s yields an image of size (M/s) × (N/s).
Further, the improved U-net network adds a pyramid pooling part P to the traditional U-net network.
By adding the upward and downward connecting structures to the model, features can be reused, feature propagation between different layers is strengthened, the information of small country roads is prevented from vanishing as the network deepens, and the vanishing-gradient problem is alleviated.
The pooling part P comprises pyramid pooling modules Z1-Z3 corresponding respectively to the convolution modules Y1-Y3 of the decoding part D; the feature maps produced by Y1-Y3 are input into Z1-Z3 respectively. The feature map output by each pyramid pooling module is sampled to the size of the original image, stacked with the input original image, and output to the classifier, which produces a prediction for each pixel.
As shown in fig. 5, the pyramid pooling module first reduces the image by average pooling to 5 different scales (1 × 1, 2 × 2, 3 × 3, 5 × 5 and 7 × 7), then applies a convolution to each of the resulting multi-scale feature maps to compress it into a single-channel feature map, and finally up-samples the feature maps to the size of the input original image and stacks them with it.
Because feature maps of different scales contain information from different receptive fields (the receptive field is the area of the input image mapped to one pixel of the feature map), adding the pyramid pooling part captures rich context information and removes the fixed-size constraint on image classification. Unlike urban roads, extra-urban roads are sparsely distributed, and the same area may contain roads of different grades. The pyramid pooling module divides the feature map into sub-regions of different sizes using pooling windows of different sizes; information under different receptive fields is extracted from these sub-regions and then fused by sampling to a uniform size. This effectively avoids fragmentation of the road extraction result and alleviates the problem that sparse, fine roads cannot be accurately extracted.
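A stripped-down numpy sketch of the pyramid pooling idea: pool to several scales, sample back up, and stack with the input. The per-scale convolutions are omitted, nearest-neighbour upsampling stands in for the learned upsampling, and the names are illustrative:

```python
import numpy as np

def avg_pool_to(feat, s):
    """Average-pool a square (H, H) map down to an (s, s) summary."""
    h = feat.shape[0]
    edges = np.linspace(0, h, s + 1).astype(int)
    out = np.zeros((s, s))
    for i in range(s):
        for j in range(s):
            out[i, j] = feat[edges[i]:edges[i + 1], edges[j]:edges[j + 1]].mean()
    return out

def upsample_nn(small, h):
    """Nearest-neighbour upsample back to (h, h)."""
    s = small.shape[0]
    idx = np.arange(h) * s // h
    return small[np.ix_(idx, idx)]

def pyramid_pool(feat, scales=(1, 2, 3, 5, 7)):
    """Pool to each scale, upsample back, and stack with the input,
    mirroring the pyramid pooling module (convolutions omitted)."""
    h = feat.shape[0]
    maps = [feat] + [upsample_nn(avg_pool_to(feat, s), h) for s in scales]
    return np.stack(maps, axis=0)
```

Each stacked map summarizes the input at a different receptive field, which is the multi-scale context the module is meant to provide.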
And 106, training the road extraction model by using the marking data and the training/testing data to obtain a model with the road recognition capability.
Samples are selected from the label data and the training/testing data obtained in the above steps (as shown in fig. 6(a) and (b)) as the training set and input into the network model shown in fig. 3, which learns to obtain a network model with initialized parameters. This network model then performs road extraction prediction on the validation-set images to obtain road extraction prediction maps. By comparing the difference between the prediction result (as shown in fig. 6(c)) and the road ground-truth image (as shown in fig. 6(b)), a loss value is calculated through the loss function; the prediction error is propagated back through the network, gradients are computed, and the parameters of each module in the network are corrected. After a preset number of training cycles, the final prediction error falls within a set threshold range and the accuracy of the prediction results is within the expected range, yielding a network model capable of accurately extracting roads.
After training, the computer stores the model parameters; the model can later be loaded for further training, or the stored parameters can be loaded to predict the road information in an image.
From prior knowledge, road targets are sparse, particularly in rural areas. The road target is the positive sample, and its area is far smaller than that of the background target, i.e., the negative sample. The loss contributed by the large number of easy background samples then overwhelms that of the few road samples, so optimization slows and may fail to reach the optimum while iterating over many simple samples. To address this, the present application introduces the Focal Loss function to calculate the loss value. The Focal Loss formula is as follows:
FL(p_t) = -α_t (1 - p_t)^γ log(p_t)

where p_t is the predicted probability for the true class. The modulation factor γ is a number greater than 0 that reduces the influence of easily classified samples on the loss value; with γ = 0 the function reduces to the cross-entropy loss, and increasing γ increases the effect of the modulating factor. The balance factor α is a number between 0 and 1 that compensates for the uneven proportion of positive and negative samples.
By adding these parameters to the cross-entropy loss, the Focal Loss function reduces the loss contributed by easily classified samples (i.e., background targets) and helps the model concentrate on the harder samples, namely the sparse off-city road targets.
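The description above matches the standard formulation of Focal Loss; a minimal sketch of that standard formulation (not code from the patent) is:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p: predicted road probability per pixel/sample, y: truth label
    (1 = road, i.e., positive; 0 = background, i.e., negative).
    """
    p_t = np.where(y == 1, p, 1.0 - p)            # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # positive/negative balance
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)))
```

As the text states, with γ = 0 (and the balance weight fixed) this reduces to the cross-entropy loss, while γ > 0 down-weights the easily classified samples.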
Although the above embodiment describes the road extraction model construction method as an ordered sequence of steps, those skilled in the art will understand that the steps need not be executed in exactly that order. For example, the step of constructing the initial U-Net-based road extraction network model may be performed first, or at any time before, during, or after steps 101-104.
According to the second embodiment of the application, the invention further provides a road extraction method based on remote sensing images. Specifically, based on an off-city remote sensing image, such as a remote sensing image of a suburb, rural area, or mountain area, automatic extraction of off-city road information is achieved by combining knowledge driving with a U-Net model. As shown in the figure, the road extraction method comprises the following steps:
Step 201: obtain a remote sensing image to be detected. The remote sensing image may be, for example, a satellite remote sensing image or an aerial image.
Step 202: detect the remote sensing image using the trained road extraction model and automatically extract the off-city road targets.
The remote sensing image to be predicted is input into the network. Before formal prediction, the network must be built again: the optimal network model parameters trained and stored earlier are loaded, and the trained parameters are transferred into the formal road extraction network.
The remote sensing image to be predicted is input into the network, which outputs a binary predicted image through its forward operations; the road areas segmented by the network can be seen visually in the predicted image. The prediction time for each 512 × 512 remote sensing image is within one second.
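The patent does not state how the binary predicted image is produced from the network output; a common approach, shown here as an assumption-laden sketch, is to threshold the per-pixel road probabilities (the 0.5 threshold below is an illustrative choice, not from the disclosure).

```python
import numpy as np

def binarize_prediction(prob_map, threshold=0.5):
    """Turn an HxW map of road probabilities in [0, 1] into a binary image:
    1 marks road pixels, 0 marks background."""
    return (prob_map >= threshold).astype(np.uint8)
```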
The generated model parameter file can also serve as the initial parameters for the next round of training, avoiding the time wasted by training parameters from scratch. With massive data, a high-performance road extraction model can ultimately be trained.
According to the third embodiment of the application, the invention also provides a road extraction device based on remote sensing images. Specifically, the device can automatically extract off-city road information from an off-city remote sensing image. The road extraction device may be implemented in software and/or hardware and is typically integrated in a computer device capable of simultaneous localization and mapping. As shown, the apparatus 300 includes an image acquisition device 301, a memory 302, and a processor 303, which may be connected by a bus or other means.
The image acquisition device 301 is configured to acquire a remote sensing image to be detected, and send the remote sensing image to be detected to the processor 303.
The Processor 303 may be a Central Processing Unit (CPU) or other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof.
The memory 302 is a non-transitory computer-readable storage medium that can store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the programs or instructions corresponding to the road extraction model construction method of the first embodiment of the present application or the road extraction method of the second embodiment. By running the non-transitory software programs or instructions stored in the memory 302, the processor 303 executes the various functional applications and data processing, that is, implements the road extraction model construction method or the remote-sensing-image-based road extraction method of the above method embodiments.
The memory 302 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created by the processor 303, and the like. Further, the memory 302 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some aspects, the memory 302 optionally includes memory located remotely from the processor 303; such remote memory may be connected to the processor 303 over a network. Optionally, the network includes, but is not limited to, the internet, an intranet, a local area network, a mobile communications network, and combinations thereof.
Although the present application has been described in detail through the above embodiments, it is not limited to them; modifications and equivalent substitutions may be made to the technical solutions of the embodiments without departing from the spirit and scope of the inventive concept of the present application.

Claims (10)

1. A road extraction model construction method is characterized by comprising the following steps:
acquiring GIS image information from the digital map, wherein the GIS image information comprises a remote sensing image reflecting the real ground feature condition and a road pure map marking the ground feature information;
cutting the remote sensing image and the road pure map at the same geographical position into small square images, screening them, and deleting erroneous corresponding pairs of the road pure map and the remote sensing image;
acquiring marking data based on the screened road pure map; acquiring road experience knowledge as training/testing data based on the screened remote sensing image;
constructing an initial road extraction network model based on a U-Net network;
and training the initial road extraction model by using the marking data and the training/testing data, and storing model parameters to obtain the road extraction model with the road recognition capability.
2. The method for constructing the road extraction model according to claim 1, wherein the remote sensing image is a google image, and the road pure map is a Baidu road pure map.
3. The method for constructing a road extraction model according to claim 1, wherein the method for obtaining road experience knowledge based on the screened remote sensing image comprises:
extracting LBP features of an original image to obtain a first feature map;
filtering the original image, extracting image features with a Sobel operator, and then obtaining a second feature map through a morphological closing operation; and
superposing the first feature map and the second feature map with the original image to obtain training/testing data.
4. The road extraction model construction method according to claim 1, wherein the initial road extraction network model comprises a first convolution module, a coding part, a decoding part and a classifier which are connected in sequence, wherein the coding part comprises a four-stage residual network module, and the decoding part comprises a three-stage convolution module.
5. The road extraction model construction method according to claim 4, wherein a bridge structure, a slant downward connection structure, and a slant upward connection structure are provided between the first convolution module, the encoding portion, and the decoding portion, and the three-level convolution module of the decoding portion receives images of different scales from the first convolution module and the encoding portion through the bridge structure, the slant downward connection structure, and the slant upward connection structure, respectively.
6. The road extraction model construction method according to claim 4 or 5, wherein the feature maps obtained by the three-level convolution module are respectively input into the pyramid pooling modules, and the feature maps output by the pyramid pooling modules are up-sampled to the size of the original image, and are output to the classifier after being overlapped with the original image.
7. The road extraction model construction method according to any one of claims 1-6, characterized in that the loss value is calculated by using Focalloss function in the model training.
8. A road extraction method based on remote sensing images is characterized by comprising the following steps:
acquiring a remote sensing image to be detected;
the remote sensing image is detected by using the road extraction model obtained by the road extraction model construction method according to any one of claims 1 to 7, and the road target is automatically extracted.
9. A road extraction device based on remote sensing images is characterized by comprising:
the image acquisition equipment is used for acquiring a remote sensing image to be detected;
a memory having computer instructions stored therein;
a processor, in data communication with the image capture device, memory, and configured to automatically extract the road object in the remotely sensed image by executing the computer instructions to perform the road extraction method of claim 8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the road extraction model construction method according to any one of claims 1-7, or the road extraction method according to claim 8.
CN201910988983.7A 2019-10-17 2019-10-17 Method and device for extracting urban road based on remote sensing image Pending CN110807376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910988983.7A CN110807376A (en) 2019-10-17 2019-10-17 Method and device for extracting urban road based on remote sensing image


Publications (1)

Publication Number Publication Date
CN110807376A true CN110807376A (en) 2020-02-18

Family

ID=69488571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910988983.7A Pending CN110807376A (en) 2019-10-17 2019-10-17 Method and device for extracting urban road based on remote sensing image

Country Status (1)

Country Link
CN (1) CN110807376A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112184554A (en) * 2020-10-13 2021-01-05 重庆邮电大学 Remote sensing image fusion method based on residual mixed expansion convolution
CN112733702A (en) * 2021-01-07 2021-04-30 华侨大学 Sidewalk detection method, device and equipment based on remote sensing image and storage medium
CN113516069A (en) * 2021-07-08 2021-10-19 北京华创智芯科技有限公司 Road mark real-time detection method and device based on size robustness
CN113569596A (en) * 2020-04-28 2021-10-29 千寻位置网络有限公司 Method and device for identifying printed matter on satellite image road
CN113610061A (en) * 2021-09-30 2021-11-05 国网浙江省电力有限公司电力科学研究院 Method and system for identifying unstressed conducting wire based on target detection and residual error network
CN113724172A (en) * 2021-08-04 2021-11-30 北京大学 Town boundary extraction method and system based on landscape function data
CN113837193A (en) * 2021-09-23 2021-12-24 中南大学 Zinc flotation froth image segmentation algorithm based on improved U-Net network
CN114694031A (en) * 2022-04-24 2022-07-01 电子科技大学 Remote sensing image typical ground object extraction method based on multitask attention mechanism

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150347867A1 (en) * 2014-06-03 2015-12-03 Digitalglobe, Inc. Some automated and semi-automated tools for linear feature extraction in two and three dimensions
CN109118491A (en) * 2018-07-30 2019-01-01 深圳先进技术研究院 A kind of image partition method based on deep learning, system and electronic equipment
CN109190481A (en) * 2018-08-06 2019-01-11 中国交通通信信息中心 A kind of remote sensing image road material extracting method and system
CN109344778A (en) * 2018-10-10 2019-02-15 成都信息工程大学 Based on the unmanned plane road extraction method for generating confrontation network
CN109447994A (en) * 2018-11-05 2019-03-08 陕西师范大学 In conjunction with the remote sensing image segmentation method of complete residual error and Fusion Features
CN109785344A (en) * 2019-01-22 2019-05-21 成都大学 The remote sensing image segmentation method of binary channel residual error network based on feature recalibration
CN109800736A (en) * 2019-02-01 2019-05-24 东北大学 A kind of method for extracting roads based on remote sensing image and deep learning
EP3534295A1 (en) * 2018-03-02 2019-09-04 Parkbob GmbH System and method for identifying parking spaces and parking occupancy based on satellite and/or aerial images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ma Yue: "Research on wetland land cover classification based on integrated multi-source remote sensing information", China Doctoral Dissertations Full-text Database, Basic Sciences *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569596A (en) * 2020-04-28 2021-10-29 千寻位置网络有限公司 Method and device for identifying printed matter on satellite image road
CN112184554A (en) * 2020-10-13 2021-01-05 重庆邮电大学 Remote sensing image fusion method based on residual mixed expansion convolution
CN112733702A (en) * 2021-01-07 2021-04-30 华侨大学 Sidewalk detection method, device and equipment based on remote sensing image and storage medium
CN113516069A (en) * 2021-07-08 2021-10-19 北京华创智芯科技有限公司 Road mark real-time detection method and device based on size robustness
CN113724172A (en) * 2021-08-04 2021-11-30 北京大学 Town boundary extraction method and system based on landscape function data
CN113724172B (en) * 2021-08-04 2023-09-26 北京大学 Town boundary extraction method and system based on landscape functional data
CN113837193A (en) * 2021-09-23 2021-12-24 中南大学 Zinc flotation froth image segmentation algorithm based on improved U-Net network
CN113837193B (en) * 2021-09-23 2023-09-01 中南大学 Zinc flotation froth image segmentation method based on improved U-Net network
CN113610061A (en) * 2021-09-30 2021-11-05 国网浙江省电力有限公司电力科学研究院 Method and system for identifying unstressed conducting wire based on target detection and residual error network
CN114694031A (en) * 2022-04-24 2022-07-01 电子科技大学 Remote sensing image typical ground object extraction method based on multitask attention mechanism
CN114694031B (en) * 2022-04-24 2023-05-02 电子科技大学 Remote sensing image typical object extraction method based on multitasking attention mechanism

Similar Documents

Publication Publication Date Title
CN110807376A (en) Method and device for extracting urban road based on remote sensing image
CN111598174B (en) Model training method based on semi-supervised antagonistic learning and image change analysis method
CN111640159B (en) Remote sensing image change detection method based on twin convolutional neural network
CN111582175B (en) High-resolution remote sensing image semantic segmentation method for sharing multi-scale countermeasure features
CN111046768B (en) Deep learning method for simultaneously extracting road pavement and center line of remote sensing image
CN110781756A (en) Urban road extraction method and device based on remote sensing image
CN113343789A (en) High-resolution remote sensing image land cover classification method based on local detail enhancement and edge constraint
CN114943963B (en) Remote sensing image cloud and cloud shadow segmentation method based on double-branch fusion network
CN113591617B (en) Deep learning-based water surface small target detection and classification method
Dang et al. Application of deep learning models to detect coastlines and shorelines
CN113052106A (en) Airplane take-off and landing runway identification method based on PSPNet network
Yang et al. Towards better classification of land cover and land use based on convolutional neural networks
CN114494821A (en) Remote sensing image cloud detection method based on feature multi-scale perception and self-adaptive aggregation
CN113313094A (en) Vehicle-mounted image target detection method and system based on convolutional neural network
CN114283343B (en) Map updating method, training method and device based on remote sensing satellite image
Thati et al. A systematic extraction of glacial lakes for satellite imagery using deep learning based technique
Mangala et al. A new automatic road extraction technique using gradient operation and skeletal ray formation
CN115546629A (en) Remote sensing image workshop identification method and system based on deep learning
CN111353441B (en) Road extraction method and system based on position data fusion
CN112651926A (en) Method and device for detecting cracks based on recursive attention mechanism
Patel et al. Road Network Extraction Methods from Remote Sensing Images: A Review Paper.
Tan et al. BSIRNet: A road extraction network with bidirectional spatial information reasoning
Dorrani Road Detection with Deep Learning in Satellite Images
Ruiz-Lendínez et al. Deep learning methods applied to digital elevation models: state of the art
CN117726954B (en) Sea-land segmentation method and system for remote sensing image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20230228
