WO2022062854A1 - Image processing method, apparatus, device, and storage medium - Google Patents
Image processing method, apparatus, device, and storage medium
- Publication number
- WO2022062854A1 (application PCT/CN2021/115515)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- building
- roof
- edge
- offset
- Prior art date
Classifications
- G06T7/0004—Industrial image inspection
- G06T7/13—Edge detection
- G06T7/136—Segmentation; edge detection involving thresholding
- G06T7/564—Depth or shape recovery from multiple images, from contours
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
- G06T2207/10032—Satellite or aerial image; remote sensing
- G06T2207/10044—Radar image
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/20088—Trinocular vision calculations; trifocal tensor
- G06T2207/30132—Industrial image inspection: masonry; concrete
Definitions
- the present disclosure relates to the field of computer technologies, and in particular, to an image processing method, apparatus, device, and storage medium.
- The base of a building may be partially occluded in the image, so its visual features are inconspicuous, which reduces the prediction accuracy of the building base.
- The present disclosure discloses at least one image processing method. The method includes: acquiring a target image including a building; performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and transforming the roof contour corresponding to the roof area according to the offset to obtain the base contour.
- the method further includes: determining the height of the building based on the offset and a predetermined scale between the height of the building and the offset.
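The height determination above reduces to a single multiplication; the variable names and the example scale value are illustrative, since the disclosure leaves the calibration of the scale between building height and offset open:

```python
def building_height(offset_px, scale_m_per_px):
    """Estimate building height from the roof-to-base pixel offset,
    assuming a predetermined linear scale between the offset observed
    in the image and the real-world building height."""
    return offset_px * scale_m_per_px

# e.g. a 24-pixel offset with a hypothetical calibrated scale of 0.5 m/px
height = building_height(24, 0.5)  # 12.0
```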
- In some embodiments, the method further includes: performing image processing on the target image to determine the edge direction corresponding to each pixel included in the roof outline of the building; and performing regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building. In this case, transforming the roof contour corresponding to the roof area according to the offset to obtain the base contour includes: transforming the roof polygon according to the offset to obtain the base contour.
- Performing the regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building includes: taking any pixel among the pixels included in the roof outline as a target pixel, and determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to an adjacent pixel of the target pixel; when the direction difference reaches a first preset threshold, determining the target pixel as a vertex of the roof polygon corresponding to the building; and obtaining the roof polygon corresponding to the building based on the determined vertices.
- The method may further include: dividing a preset angle to obtain N angle intervals, and assigning identification values to the N angle intervals, where N is a positive integer. Determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to an adjacent pixel of the target pixel then includes: determining the first angle interval to which the edge direction corresponding to the target pixel belongs; determining the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determining the difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval as the direction difference between the two edge directions.
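The interval-based direction difference can be sketched as follows. The choice of N = 8, a 360-degree preset angle, and a vertex threshold of 1 are assumptions for illustration; the disclosure only requires that N and the thresholds be preset:

```python
def interval_id(angle_deg, n=8, full_angle=360.0):
    """Identification value of the angle interval an edge direction falls in
    (here the interval index doubles as the identification value)."""
    width = full_angle / n
    return int((angle_deg % full_angle) // width)

def direction_difference(angle_a, angle_b, n=8):
    """Direction difference between two edge directions, expressed as the
    gap between their interval identification values."""
    return abs(interval_id(angle_a, n) - interval_id(angle_b, n))

def is_vertex(angle_a, angle_b, first_preset_threshold=1, n=8):
    """A target pixel is a roof-polygon vertex when the direction difference
    with its neighbour reaches the threshold (assumed value 1)."""
    return direction_difference(angle_a, angle_b, n) >= first_preset_threshold

# with n=8 (45-degree intervals): 10 and 40 degrees share interval 0, so no
# vertex; 40 and 100 degrees fall in intervals 0 and 2, so the pixel is a vertex
```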
- N is a positive integer less than or equal to the second preset threshold.
- The method may further include: correcting the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
- Determining the offset between the side bottom edge and the side top edge according to the offset angle includes: determining the position change amount of the side bottom edge when it is moved toward the side top edge in the direction of the offset angle, and using the position change amount as the offset.
- Determining the position change amount and using it as the offset includes: cropping the side-top-edge probability map corresponding to the side top edge based on the preset frame corresponding to the side top edge to obtain a first cropping result, where the side-top-edge probability map covers the image area of the target image that includes the side top edge; moving the side bottom edge multiple times in the direction of the offset angle and, after each movement, cropping the side-bottom-edge probability map corresponding to the side bottom edge based on the preset frame to obtain a plurality of second cropping results, where the side-bottom-edge probability map covers the image area of the target image that includes the side bottom edge; and determining, among the plurality of second cropping results, the target cropping result that matches the first cropping result, and determining the position change of the side bottom edge at which the target cropping result was obtained as the offset.
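The crop-and-match search described above can be sketched with NumPy. The matching score (elementwise product of the two crops) and the angle convention (0 degrees pointing along the positive column direction) are illustrative assumptions; the disclosure does not fix either:

```python
import numpy as np

def find_offset(top_prob, bottom_prob, box, angle_deg, step=1.0, max_offset=20.0):
    """Slide the side-bottom-edge probability map along the offset-angle
    direction and return the displacement whose crop inside the preset
    frame best matches the side-top-edge crop.

    box: (row0, row1, col0, col1), the preset frame around the side top edge.
    """
    r0, r1, c0, c1 = box
    first_crop = top_prob[r0:r1, c0:c1]                      # first cropping result
    direction = np.array([-np.sin(np.radians(angle_deg)),    # row change
                          np.cos(np.radians(angle_deg))])    # column change
    best_score, best_offset = -np.inf, 0.0
    d = 0.0
    while d <= max_offset:                                   # preset step / max offset
        dr, dc = np.round(d * direction).astype(int)
        moved = np.roll(np.roll(bottom_prob, dr, axis=0), dc, axis=1)
        second_crop = moved[r0:r1, c0:c1]                    # one second cropping result
        score = float((first_crop * second_crop).sum())
        if score > best_score:
            best_score, best_offset = score, d
        d += step
    return best_offset
```

With a synthetic bottom edge displaced 5 columns from the top edge, the search recovers an offset of 5 at angle 0.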
- The method may further include: determining the circumscribed frame corresponding to the side top edge as the preset frame; or determining the circumscribed frame corresponding to a combined edge, obtained by combining a plurality of side top edges included in the roof outline, as the preset frame.
- In some embodiments, performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building includes: using an image processing model to perform the image processing. The image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge directions, and an offset angle prediction sub-model for outputting the offset angle.
- The training method of the image processing model includes: acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and, based on the joint learning loss information, jointly training each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
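One simple way to build the joint learning loss from the per-sub-model losses is a (optionally weighted) sum. The weighting scheme, sub-model names, and loss values below are assumptions for illustration, since the disclosure does not fix the exact combination:

```python
def joint_learning_loss(sub_losses, weights=None):
    """Combine per-sub-model losses into one joint training objective.

    sub_losses: dict mapping a sub-model name to its scalar loss for the
    current batch; weights defaults to 1.0 for every sub-model.
    """
    weights = weights or {name: 1.0 for name in sub_losses}
    return sum(weights[name] * loss for name, loss in sub_losses.items())

# hypothetical batch losses for the four sub-models
loss = joint_learning_loss({
    "roof_area": 0.8,
    "building_edge": 0.5,
    "edge_direction": 0.3,
    "offset_angle": 0.2,
})  # approximately 1.8
```

All branches then receive gradients from this single scalar, which is what lets the sub-models be trained jointly until each converges.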
- The present disclosure also provides an image processing apparatus, including: an acquisition module for acquiring a target image including a building; an image processing module for performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; an offset determination module for determining the offset between the side bottom edge and the side top edge according to the offset angle; and a transformation module for transforming the roof contour corresponding to the roof area according to the offset to obtain the base contour.
- The apparatus may further include a building height determination module configured to determine the height of the building based on the offset and a predetermined scale between building height and offset.
- The apparatus may further include: an edge direction determination module configured to perform image processing on the target image and determine the edge direction corresponding to each pixel included in the roof outline of the building; and a regularization processing module that performs regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building. The transformation module is then specifically configured to transform the roof polygon according to the offset to obtain the base contour.
- The regularization processing module includes: a first determination sub-module configured to take any pixel among the pixels included in the roof outline as a target pixel and determine the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to its adjacent pixels; a second determination sub-module configured to determine the target pixel as a vertex of the roof polygon corresponding to the building when the direction difference reaches the first preset threshold; and a roof polygon determination sub-module that obtains the roof polygon corresponding to the building based on the determined vertices.
- The apparatus may further include a dividing module for dividing the preset angle to obtain N angle intervals and assigning identification values to the N angle intervals, where N is a positive integer. The first determination sub-module is specifically configured to: determine the first angle interval to which the edge direction corresponding to the target pixel belongs; determine the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determine the difference between the identification values corresponding to the first and second angle intervals as the direction difference between the two edge directions.
- N is a positive integer less than or equal to the second preset threshold.
- The apparatus may further include a vertex correction module that corrects the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
- The offset determination module includes an offset determination sub-module configured to determine the position change amount of the side bottom edge when it is moved toward the side top edge in the direction of the offset angle, and to use the position change amount as the offset.
- The offset determination sub-module is specifically configured to: crop the side-top-edge probability map corresponding to the side top edge based on the preset frame corresponding to the side top edge to obtain a first cropping result, where the side-top-edge probability map covers the image area of the target image that includes the side top edge; move the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset and, after each movement, crop the side-bottom-edge probability map corresponding to the side bottom edge based on the preset frame to obtain a plurality of second cropping results, where the side-bottom-edge probability map covers the image area of the target image that includes the side bottom edge; and determine, among the plurality of second cropping results, the target cropping result that matches the first cropping result, and determine the corresponding position change as the offset.
- The apparatus may be further configured to: determine the circumscribed frame corresponding to the side top edge as the preset frame; or determine the circumscribed frame corresponding to a combined edge, obtained by combining a plurality of side top edges included in the roof outline, as the preset frame.
- The image processing module is specifically configured to use an image processing model to perform image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building. The image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge directions, and an offset angle prediction sub-model for outputting the offset angle.
- The training apparatus corresponding to the training method of the image processing model includes: a training sample acquisition module for acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, the edges included in the building outline, the edge directions corresponding to the pixels included in the building, and the offset angle between the roof and the base of the building; a loss information determination module for constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and a joint training module for jointly training each sub-model included in the image processing model with the plurality of training samples, based on the joint learning loss information, until each sub-model converges.
- The present disclosure also proposes an electronic device, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to call the executable instructions stored in the memory to implement the image processing method shown in any of the above embodiments.
- The present disclosure also provides a computer-readable storage medium storing a computer program, where the computer program is used to execute the image processing method shown in any of the foregoing embodiments.
- In the above solution, the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building are first determined from the acquired target image. Then, according to the offset angle, the offset between the side bottom edge and the side top edge is determined. Finally, the roof contour corresponding to the roof area is transformed according to the offset to obtain the base contour. The building base prediction therefore does not rely on base features in the target image, so a building base of higher precision can be obtained even when the base features of the building in the target image are occluded.
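The final transformation step amounts to translating every roof-contour vertex by the offset along the offset-angle direction. A minimal sketch, assuming the angle is measured from the positive x-axis (the disclosure's exact angle convention may differ):

```python
import math

def roof_to_base(roof_contour, offset, angle_deg):
    """Translate every vertex of the roof contour by the offset along the
    offset-angle direction to obtain the base contour."""
    dx = offset * math.cos(math.radians(angle_deg))
    dy = offset * math.sin(math.radians(angle_deg))
    return [(x + dx, y + dy) for x, y in roof_contour]

roof = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
base = roof_to_base(roof, offset=5.0, angle_deg=0.0)
# each vertex shifts 5 units along +x: [(5,0), (9,0), (9,3), (5,3)]
```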
- FIG. 2 is a schematic flowchart of base profile prediction shown in the present disclosure.
- FIG. 3 is a schematic flowchart of image processing on a target image shown in the present disclosure.
- FIG. 5 is a schematic diagram of an offset between a roof and a base shown in the present disclosure.
- FIG. 6 is a method flowchart of a method for determining an offset shown in the present disclosure.
- FIG. 7 is a schematic diagram of a movement process of a side bottom edge shown in the present disclosure.
- FIG. 8 is a schematic flowchart of a base profile prediction shown in the present disclosure.
- FIG. 9 is a schematic diagram of an edge direction shown in the present disclosure.
- FIG. 10 is a schematic flowchart of image processing on a target image shown in the present disclosure.
- FIG. 11 is a method flow chart of an image processing model training method shown in the present disclosure.
- FIG. 12 is a schematic diagram of an image processing apparatus shown in the present disclosure.
- FIG. 13 is a hardware structure diagram of an electronic device shown in the present disclosure.
- the present disclosure aims to propose an image processing method.
- The method predicts, from the target image, the roof area of the building and the relevant features for determining the offset between the roof and the base of the building, and determines the offset according to those features. Once the offset is determined, the roof contour corresponding to the roof area is transformed based on the offset to obtain the base contour. The building base prediction therefore does not rely on base features in the target image, so a building base of higher accuracy can be obtained even when the base features of the building in the target image are occluded.
- FIG. 1 is a method flowchart of an image processing method shown in the present disclosure. As shown in FIG. 1, the method may include:
- S104 Perform image processing on the target image to determine the roof area of the building, the side bottom edge and the side top edge of the building, and the offset angle between the roof and the base of the building.
- S106 Determine the offset between the bottom edge of the side surface and the top edge of the side surface according to the offset angle.
- the above-mentioned image processing method can be applied to electronic equipment.
- the above-mentioned electronic device may execute the above-mentioned image processing method by carrying a software system corresponding to the image processing method.
- The electronic device may be a notebook computer, a desktop computer, a server, a mobile phone, a tablet (PAD) terminal, or the like, which is not particularly limited in the present disclosure.
- The image processing method can be executed by the terminal device alone, by the server device alone, or by the terminal device and the server device in cooperation.
- the above-mentioned image processing method can be integrated in the client.
- After receiving an image processing request, the terminal device equipped with the client can provide computing power through its own hardware environment to execute the image processing method.
- the above-mentioned image processing method can be integrated into the system platform.
- the server device equipped with the system platform can provide computing power through its own hardware environment to execute the above image processing method.
- the above image processing method can be divided into two tasks: acquiring a target image and processing the target image.
- the acquisition task can be integrated in the client and carried on the terminal device.
- Processing tasks can be integrated on the server and carried on the server device.
- the above terminal device may initiate an image processing request to the above server device after acquiring the target image.
- the server device may execute the method on the target image in response to the request.
- In the following description, the execution subject is an electronic device (hereinafter referred to as the device) by way of example.
- FIG. 2 is a schematic diagram of a base profile prediction process shown in the present disclosure.
- First, the device may execute S104: perform image processing on the target image, and determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
- The target image refers to an image that includes at least one building.
- the above-mentioned target image may be a remote sensing image captured by a device such as an aircraft, an unmanned aerial vehicle, or a satellite.
- the above-mentioned device may complete the acquisition of the target image by interacting with the user.
- the above-mentioned device may provide the user with a window for inputting the target image to be processed through the interface mounted on the device, so that the user can input the image.
- the user can complete the input of the target image based on this window.
- the image can be input into the image processing model for calculation.
- the above-mentioned device can directly acquire the remote sensing image output by the remote sensing image acquisition system.
- The device may pre-establish a certain protocol with the remote sensing image acquisition system, so that after the remote sensing image acquisition system generates a remote sensing image, the image is sent to the device for image processing.
- The device may perform image processing on the target image using an image processing model, so as to extract from the target image the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
- The image processing model may specifically be a model that predicts, for the target image, the roof area of the building and the related features mentioned above.
- the image processing model may be a pre-trained neural network model.
- FIG. 3 is a schematic flowchart of image processing on a target image shown in the present disclosure.
- the above-mentioned image processing model may include three branches, and the three branches may share the same backbone network.
- the first branch can be used to predict the roof area
- the second branch can be used to predict the side edge (including the side top edge and the side bottom edge)
- the third branch can be used to predict the offset angle between the roof and the base of the building.
- the structure of the image processing model shown in FIG. 3 is only a schematic illustration, and in practical applications, the structure of the model can be built according to the actual situation.
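As a minimal sketch of this dataflow (not the disclosed architecture: the backbone and head callables below are throwaway placeholders), the shared-backbone, three-branch arrangement can be written as:

```python
import numpy as np

def image_processing_model(image, backbone, roof_head, edge_head, angle_head):
    """Run one shared backbone, then three prediction branches, mirroring
    the arrangement of FIG. 3. Every callable is a placeholder: any
    segmentation backbone (e.g. VGG- or ResNet-style) and heads would fit."""
    features = backbone(image)
    roof_mask = roof_head(features)       # branch 1: pixel-level roof area
    edge_map = edge_head(features)        # branch 2: pixel-level side edges
    offset_angle = angle_head(features)   # branch 3: image-level offset angle
    return roof_mask, edge_map, offset_angle

# toy stand-ins, just to show the shapes flowing through the branches
img = np.zeros((8, 8, 3))
roof, edges, angle = image_processing_model(
    img,
    backbone=lambda im: im.mean(axis=-1),    # (8, 8) shared feature map
    roof_head=lambda f: f > 0.5,             # boolean roof mask
    edge_head=lambda f: np.zeros_like(f),    # empty edge map
    angle_head=lambda f: 42.0,               # hypothetical predicted angle
)
```

The point of sharing the backbone is that all three branches are supervised through one feature extractor, which is what makes the joint training described later possible.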
- the above image processing model may be obtained by training based on a plurality of training samples marked with annotation information.
- the above-mentioned label information may include the roof area, the top edge of the side of the building and the bottom edge of the side of the building, and the offset angle between the roof and the base of the building.
- the above-mentioned backbone network is specifically used for feature prediction for the target image.
- the above-mentioned backbone network may be a feature prediction network such as VGG, ResNet, etc., which is not particularly limited here.
- The first and second branches may specifically be pixel-level segmentation networks.
- The first branch can classify each pixel included in the target image as either roof or background.
- For example, suppose the target image includes pixel A. If the first branch predicts that pixel A is a pixel included in the roof area, pixel A is classified as roof.
- The second branch can classify each pixel included in the target image as one of side top edge, side bottom edge, or background.
- For example, suppose the target image includes pixel B. If the second branch predicts that pixel B is a pixel included in the side top edge, pixel B is classified as side top edge.
- the above-mentioned branch 1 may also divide each pixel included in the target image into one of the categories of roof, side, and background.
- the above-mentioned branch 2 can classify each pixel included in the target image into one of the background, the edge between the roof and the background, the top edge of the side, the bottom edge of the side, and the hypotenuse edge of the left and right sides of the building side.
- the above-mentioned branch 3 may specifically be an image-level segmentation network.
- the above branch three can predict the offset angle between the base and the roof of the building included in the target image.
- the above-mentioned offset angle may refer to the angular offset between the base and the roof.
- the x-axis and the y-axis may belong to a rectangular coordinate system constructed with the lower left corner of the target image as the coordinate center.
- the offset angle may be the angle between the side hypotenuse and the vertical direction (eg, may be the angle between the side hypotenuse and the vertical downward direction).
- the tangent of the angle obtained by subtracting 90 degrees from the included angle equals the ratio of the change along the y-axis to the change along the x-axis between the roof position and the base position.
- FIG. 4 is a schematic diagram of an offset angle shown in the present disclosure.
- the coordinate system shown in FIG. 4 is a rectangular coordinate system constructed with the lower left corner of the target image as the coordinate origin.
- the side hypotenuse is the edge where the base connects to the roof.
- the angle θ is the above-mentioned offset angle.
- the offset angle of each building in the captured target image is approximately the same.
- the above-mentioned offset angle may also be other angles defined by the developer according to the actual situation, which can indicate the included angle between the base and the roof of the building.
- the above-mentioned offset angle may also be the included angle between the side hypotenuse and the vertical upward direction or the horizontal rightward direction, and so on.
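- the relation above between the offset angle and the roof-to-base displacement can be sketched as follows (a minimal illustration, assuming the offset angle θ is measured from the vertical downward direction as in FIG. 4; the function name is hypothetical):

```python
import math

def slope_from_offset_angle(theta_deg):
    # For an offset angle theta measured from the vertical downward
    # direction, tan(theta - 90 degrees) gives the ratio of the change
    # along the y-axis to the change along the x-axis between the roof
    # position and the base position.
    return math.tan(math.radians(theta_deg - 90.0))
```

For example, an offset angle of 135 degrees corresponds to a slope of 1, i.e. the roof is displaced by equal amounts along the x-axis and the y-axis relative to the base.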
- the above-mentioned side bottom edge is usually one of the edges included in the outline of the building base.
- the above-mentioned side top edge is usually one of the edges included in the roof profile of the building.
- S106 may be executed to determine the offset between the side bottom edge and the side top edge according to the offset angle.
- the above-mentioned offset may specifically refer to the positional offset between the base and the roof.
- the roof contour can be transformed into the base contour according to the above offset.
- the above offset may be an offset vector, that is, the amount of movement between the roof and the base along the x-axis and the y-axis.
- FIG. 5 is a schematic diagram of an offset between a roof and a base shown in the present disclosure.
- the coordinate system shown in FIG. 5 is a rectangular coordinate system constructed with the lower left corner of the target image as the coordinate origin.
- Point P is a point on the base of the building.
- Point Q is the point corresponding to point P on the roof of the building.
- the offset (x2-x1, y2-y1) between the coordinates corresponding to the Q point and the coordinates corresponding to the P point is the above offset.
- when determining the offset between the side bottom edge and the side top edge according to the offset angle, the device may input the offset angle, the side bottom edge, and the side top edge into an offset determination unit for calculation, to obtain the offset between the side bottom edge and the side top edge.
- the above offset determination unit is configured with an offset determination algorithm.
- the algorithm can determine the position change amount of the bottom edge of the side surface moving to the top edge of the side surface in the direction of the offset angle, and use the position change amount as the offset amount.
- the algorithm may also determine the position change amount of the top edge of the side surface moving to the bottom edge of the side surface in the direction of the offset angle, and use the position change amount as the offset amount.
- the offset can be determined by moving the top edge of the side or the bottom edge of the side.
- the principles of the two methods are the same, and the steps can be referred to each other.
- the following describes an example of determining the offset by moving the side bottom edge.
- FIG. 6 is a method flowchart of an offset determination method shown in the present disclosure.
- the above-mentioned device may first execute S602, and based on the preset frame corresponding to the above-mentioned side top edge, crop the side top edge probability map corresponding to the above-mentioned side top edge to obtain a first cropping result.
- the above-mentioned preset frame may specifically be a preset frame including a side top edge.
- the target image can be cropped through the frame to obtain the pixels inside the frame.
- the circumscribed border corresponding to the above-mentioned side top edge may be determined as the above-mentioned preset border.
- the top edge belonging to the same building obtained through S104 may be discontinuous due to model limitations, so that a preset frame determined from the side top edge alone cannot crop the top edge well, affecting the accuracy of offset determination. Therefore, in order to obtain an accurate preset frame, the preset frame can be determined from the roof contour and the side top edge together.
- a circumscribed frame corresponding to the combined edge obtained by combining a plurality of side top edges included in the roof profile may be determined as the above-mentioned preset frame.
- since the preset frame is determined based on a combined edge obtained by combining the plurality of side top edges included in the roof outline, the preset frame may include the complete top edge, so that the first cropping result may include the complete side top edge, improving the accuracy of offset determination.
- the probability map of the side top edge corresponding to the above-mentioned side top edge may be cropped by the preset frame to obtain a first cropping result.
- the above-mentioned side top edge probability map may specifically be a top edge segmentation map obtained by performing image processing on the target image in S104.
- the figure includes the side top edge included in the target image above.
- the probability map is cropped according to the preset frame to obtain the first cropping result, which actually obtains an area including the top edge of the side of the building.
- S604 may be executed: according to the preset step size and the preset maximum offset, the above-mentioned side bottom edge is moved multiple times in the direction of the above-mentioned offset angle, and after each movement, the side bottom edge probability map corresponding to the above-mentioned side bottom edge is cropped based on the above-mentioned preset frame to obtain a plurality of second cropping results.
- the above-mentioned preset step size specifically refers to the distance by which the side bottom edge moves along the x-axis direction in each movement.
- the preset step size can be set according to the actual situation. For example, if the resolution of the target image is larger, a larger preset step size can be set; otherwise, a smaller preset step size can be set.
- the preset step size may also be the distance by which the side bottom edge moves along the y-axis direction, which is not particularly limited here. The following description takes as an example a preset step size in which the side bottom edge moves along the x-axis by m each time.
- the above-mentioned preset maximum offset specifically refers to the maximum value of the movement of the bottom edge of the side surface along the x-axis direction.
- the preset maximum offset may be set according to the actual situation. For example, if the resolution of the target image is larger, a larger preset maximum offset may be set; otherwise, a smaller preset maximum offset may be set.
- the preset maximum offset may also be the maximum value of the movement of the side bottom edge along the y-axis direction, which is not particularly limited herein. The following description takes as an example a preset maximum offset in which the maximum movement of the side bottom edge along the x-axis is n (where n is greater than m).
- the above-mentioned S604 may be executed, and according to the preset step size and the preset maximum offset, the bottom edge of the side surface is moved multiple times in the direction of the offset angle, and every time After the second movement, the bottom edge probability map of the side surface corresponding to the bottom edge of the side surface is cropped based on the preset frame to obtain a plurality of second cropping results.
- the above-mentioned side bottom edge probability map may specifically be a bottom edge segmentation map obtained by performing image processing on the target image in S104.
- the figure includes the side bottom edge included in the target image above.
- the second cropping result is obtained by cropping the probability map according to the preset frame, which actually obtains the area corresponding to the position of the bottom edge of the side of the building in the probability map of the side bottom edge.
- FIG. 7 is a schematic diagram of a movement process of a side bottom edge shown in the present disclosure.
- the coordinate system shown in FIG. 7 is a rectangular coordinate system constructed with the lower left corner of the side bottom edge probability map as the coordinate origin.
- DE is the initial position of the bottom edge of the side.
- FG is an intermediate position of the bottom edge of the side during the movement.
- HI is the last position of the bottom edge of the side when the movement ends (that is, the position corresponding to the preset maximum offset).
- the dotted frame is the determined preset frame.
- the bottom edge of the side starts from the initial position DE, and moves m along the x-axis direction and tan(θ-90°)*m along the y-axis direction each time, until it moves to the position HI.
- the region corresponding to the coordinate position of the preset frame on the side bottom edge probability map may be cropped according to the preset frame to obtain a second cropping result.
- S606 may be executed: among the plurality of second cropping results, a target cropping result matching the above-mentioned first cropping result is determined, and the position change amount of the side bottom edge when the above-mentioned target cropping result is obtained is determined as the above-mentioned offset.
- a distance measure such as the Euclidean distance or the Mahalanobis distance can be used to determine the similarity between the first cropping result and each second cropping result, and the highest similarity is found among the determined similarities. After the highest similarity is determined, the second cropping result corresponding to it is determined as the target cropping result. After the target cropping result is determined, the position change of the side bottom edge when the target cropping result is obtained may be determined as the offset.
- the second cropping result obtained by cropping when the bottom edge of the side surface moves to the position FG is the above-mentioned target cropping result.
- the combination of the changes in the x-axis and the y-axis when the bottom edge of the side surface moves from the position DE to the position FG can be determined as the above offset.
- since the second cropping result obtained by cropping according to the above preset frame is most similar to the first cropping result only when the side bottom edge has moved to the position of the side top edge, the above scheme determines an accurate offset by taking the position change of the side bottom edge, when it moves to the side top edge in the direction of the offset angle, as the offset.
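- the matching procedure of S604 to S606 can be sketched as follows (a simplified illustration under stated assumptions: the probability maps are 2D arrays, the preset frame is given as (x0, y0, w, h) in pixel coordinates, moving the side bottom edge by (dx, dy) is implemented as shifting the cropping window by (-dx, -dy), and all names are hypothetical):

```python
import numpy as np

def find_offset(top_crop, bottom_prob, box, theta_deg, m, n):
    # Slide over candidate shifts of the side bottom edge: step m along
    # the x-axis and tan(theta - 90)*m along the y-axis each time, up to
    # a maximum x-shift of n; return the shift whose crop of the
    # bottom-edge probability map is closest (Euclidean distance) to the
    # top-edge crop.
    x0, y0, w, h = box
    slope = np.tan(np.radians(theta_deg - 90.0))
    best, best_dist = (0, 0.0), np.inf
    for k in range(int(n // m) + 1):
        dx = k * m
        dy = slope * dx
        # moving the edge by (dx, dy) equals shifting the window by (-dx, -dy)
        x, y = int(round(x0 - dx)), int(round(y0 - dy))
        crop = bottom_prob[y:y + h, x:x + w]
        if crop.shape != top_crop.shape:
            continue  # candidate window falls outside the map
        d = float(np.linalg.norm(crop - top_crop))
        if d < best_dist:
            best_dist, best = d, (dx, dy)
    return best
```

With θ = 90° (no vertical displacement), the search reduces to a horizontal sweep; the best match is found where the bottom-edge pattern aligns with the top-edge crop.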
- S108 may be continued, and according to the offset, the roof contour corresponding to the roof area is transformed to obtain the base contour.
- the roof profile corresponding to the roof area can be determined first.
- the outline enclosed by the outermost pixel points corresponding to the roof area is determined as the above-mentioned roof outline.
- the roof contour corresponding to the roof area may be transformed according to the offset to obtain the base contour.
- the coordinates corresponding to each pixel included in the roof outline can be translated along the x-axis and y-axis directions by the offset to obtain the coordinates corresponding to each pixel included in the base outline, thereby determining the base outline and completing the prediction of the base.
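- the translation in S108 can be sketched as follows (a minimal illustration; the function name is hypothetical, and the offset is taken as the roof-minus-base vector (x2 - x1, y2 - y1) from FIG. 5, so the base contour is obtained by subtracting it from the roof contour):

```python
import numpy as np

def transform_roof_to_base(roof_contour, offset):
    # Translate every (x, y) pixel coordinate of the roof contour by the
    # roof-to-base offset; with offset = roof - base, the base contour is
    # the roof contour minus the offset vector.
    return np.asarray(roof_contour) - np.asarray(offset)
```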
- the roof area of the building, the side bottom edge and the side top edge of the building, and the offset angle between the roof and the base of the building are determined from the acquired target image. Then, according to the offset angle, the offset between the bottom edge of the side surface and the top edge of the side surface is determined. Finally, according to the above offset, the roof contour corresponding to the above roof area is transformed to obtain the base contour, so as to realize the translation transformation of the roof contour with obvious visual characteristics, obtain the building base contour, and complete the building base prediction.
- in order to improve the prediction accuracy of the building base, after the roof contour is obtained, the roof contour may be regularized.
- FIG. 8 is a schematic diagram of a base profile prediction process shown in the present disclosure.
- the edge directions corresponding to each pixel included in the roof outline of the building can also be obtained.
- the above-mentioned edge direction specifically refers to the normal vector direction of the edge.
- the above edge direction is usually quantified by the edge direction angle.
- the above-mentioned edge direction angle may be the angle between the above-mentioned normal vector and the vertical direction (for example, may be the above-mentioned angle between the above-mentioned normal vector and the above-mentioned vertical downward direction).
- FIG. 9 is a schematic diagram of an edge direction shown in the present disclosure.
- the direction corresponding to the normal vector LR of the edge JK is the edge direction of the aforementioned edge JK.
- a vertically downward direction vector LS can be constructed, and the edge direction can be indicated by the included angle α between the normal vector LR and the direction vector LS. It can be understood that the edge direction corresponding to the pixel points included in a certain edge is generally consistent with the edge direction corresponding to that edge.
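- the quantification of the edge direction by its angle to the vertical downward direction can be sketched as follows (a minimal illustration; the function name is hypothetical):

```python
import math

def edge_direction_angle(normal):
    # Angle (in degrees) between an edge's normal vector and the vertical
    # downward direction vector (0, -1), as illustrated in FIG. 9.
    nx, ny = normal
    cos_angle = -ny / math.hypot(nx, ny)  # normal . (0, -1) / |normal|
    return math.degrees(math.acos(cos_angle))
```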
- the above-mentioned device may divide the preset angle in advance to obtain N angle intervals; wherein, N is a positive integer.
- the above-mentioned preset angle may be an empirical angle. For example, 360 degrees or 720 degrees, etc.
- in order to reduce the number of edge direction types and expand the span of the angle intervals, N may be a positive integer less than or equal to a second preset threshold.
- the second preset threshold may be a suitable number of angle intervals determined by experience.
- in this way, the span of each angle interval is expanded and the value range of edge direction types is narrowed, so that quantizing the edge direction does not rely too heavily on the prediction accuracy of the above image processing model. This can improve the prediction accuracy of the edge direction, so as to further extract a more accurate roof contour and improve the prediction accuracy of the building base.
- identification values may be assigned to the above N angle intervals, respectively.
- the above-mentioned identification value corresponds to the angle interval one-to-one.
- the sequence number of the angle interval may be used as the above identification value. For example, when a certain angle interval is the third interval, 3 may be used as the identification value corresponding to that angle interval.
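- the assignment of identification values to angle intervals can be sketched as follows (a minimal illustration, assuming the preset angle of 360 degrees is divided evenly into N = 8 intervals; both choices are hypothetical):

```python
def angle_interval_id(angle_deg, preset_angle=360.0, n_intervals=8):
    # Map an edge direction angle to the 1-based sequence number of the
    # angle interval it falls in; that sequence number serves as the
    # interval's identification value.
    span = preset_angle / n_intervals
    return int((angle_deg % preset_angle) // span) + 1
```

For example, with 45-degree intervals an angle of 100 degrees falls into the third interval, so its identification value is 3.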
- the above-mentioned target image can be input into a pre-built edge direction prediction model to predict the edge direction.
- FIG. 10 is a schematic flowchart of image processing on a target image according to the present disclosure.
- the above-mentioned image processing model also includes a fourth branch.
- the fourth branch can be used to predict the edge direction corresponding to each pixel point included in the roof outline.
- the branch four shares the above-mentioned backbone network with the other three branches.
- the above image processing model may be obtained by training based on a plurality of training samples marked with annotation information.
- the above-mentioned label information further includes the edge direction corresponding to each pixel point.
- each pixel of the original image can be marked with an identification value corresponding to the angle interval to which its edge direction belongs. For example, after the original image is acquired, each pixel of the original image can be traversed, and the identification value corresponding to the angle interval to which the edge direction of each pixel belongs is annotated by labeling software.
- the above-mentioned roof outline can be regularized based on the above-mentioned edge direction to obtain the roof polygon corresponding to the above-mentioned building.
- any pixel point among the pixel points included in the above roof contour can be used as the target pixel point, and the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel points of the target pixel point is determined.
- if the direction difference reaches a first preset threshold, the target pixel point is determined as a vertex of the roof polygon corresponding to the building.
- the above-mentioned adjacent pixel point may refer to any pixel point in two pixel points adjacent to the target pixel point.
- the above-mentioned first preset threshold may be a threshold set according to experience. When the direction difference between the edge directions corresponding to two adjacent pixels reaches the threshold, it can indicate that the two pixels do not belong to the same edge. Therefore, according to the above steps, among the pixels included in the roof contour, the pixels that belong to the same edge as their adjacent pixels can be deleted, and the pixels that do not (that is, the vertices) are retained, so as to achieve the purpose of regularizing the roof contour.
- depending on the method used to quantify the edge direction, the value of the above-mentioned first preset threshold also differs.
- when the edge direction is quantified by the edge direction angle, the first preset threshold may be 30.
- when the edge direction is quantified by the identification value of the angle interval to which the edge direction angle belongs, the first preset threshold may be 3.
- when determining the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel points of the target pixel point, the first edge direction angle corresponding to the target pixel point and the second edge direction angle corresponding to the adjacent pixel points can be determined.
- the difference between the first edge direction angle and the second edge direction angle may then be determined as the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel points of the target pixel point.
- alternatively, the first angle interval to which the edge direction corresponding to the target pixel point belongs and the second angle interval to which the edge direction corresponding to the adjacent pixel points belongs may be determined.
- the difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval may be determined as the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel points of the target pixel point.
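- the vertex-extraction step above can be sketched as follows (a simplified illustration, assuming the edge direction of each contour pixel is already quantified as an angle-interval identification value and the first preset threshold is 3, as in the example above; names are hypothetical):

```python
def roof_polygon_vertices(contour_dirs, threshold=3):
    # Walk the closed roof contour; a pixel whose edge direction differs
    # from that of the next pixel by at least the threshold marks the
    # boundary between two edges, i.e. a vertex of the roof polygon.
    n = len(contour_dirs)
    vertices = []
    for i in range(n):
        if abs(contour_dirs[i] - contour_dirs[(i + 1) % n]) >= threshold:
            vertices.append(i)
    return vertices
```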
- the roof polygon corresponding to the above-mentioned building can be obtained based on the determined vertices of the roof polygon.
- the vertices of the roof polygon may be corrected based on the vertex correction model to obtain the corrected roof polygon;
- the vertex correction model is a model determined based on a graph neural network.
- the above-mentioned vertex correction model may be a model independent of the image processing model used in the present disclosure, or may be a sub-model (sub-module) of the image processing model, which is not particularly limited here.
- the above-mentioned vertex correction model can be regarded as a sub-model (sub-module) of the image processing model, and the vertex correction can be performed by using this sub-model (sub-module).
- since the vertex correction model based on the graph neural network is used in this scheme to further correct the roof polygon, a more accurate roof contour can be obtained and the prediction accuracy of the building base can be improved.
- the above roof polygon can be transformed according to the above offset to obtain the outline of the base.
- this scheme can predict a more accurate roof outline, thereby improving the prediction accuracy of the building base.
- the building height may also be predicted based on a single-scene target image. Specifically, after the offset amount is determined, based on the offset amount and a predetermined scale between the height of the building and the offset amount, the corresponding building height may be determined as the height of the building.
- the real heights of some buildings may be acquired in advance, and the offsets corresponding to these buildings determined by using the offset determination method described in the present disclosure. Then, the scale between the building height and the offset is determined based on the above data.
- the corresponding building height can be determined according to the predicted offset.
- the corresponding building height may be obtained based on the above offset and a predetermined scale between the building height and the offset. Therefore, when predicting the height of a building, there is no need to rely on remote sensing images, lidar (LIDAR) data, digital surface model (DSM) and other data from multiple scenes and different perspectives, thereby reducing the cost and difficulty of building height prediction.
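- the height estimate can be sketched as follows (a minimal illustration; the scale is assumed to have been calibrated beforehand from buildings of known height, and the names are hypothetical):

```python
def building_height(offset, scale):
    # Height is taken proportional to the magnitude of the roof-to-base
    # offset vector, using a pre-calibrated scale (height per pixel of
    # offset).
    dx, dy = offset
    return scale * (dx * dx + dy * dy) ** 0.5
```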
- the image processing model used in the building base prediction scheme may include a roof area prediction sub-model for outputting the above-mentioned roof area, and a building edge prediction sub-model for outputting the above-mentioned side bottom edge and the above-mentioned side top edge, A building edge direction prediction sub-model for outputting the above-mentioned edge direction, and an offset angle prediction sub-model for outputting the above-mentioned offset angle.
- the multi-task joint training method is adopted when training the image processing model.
- constraints on the roof area and side area of the building, the edges included in the building outline, the edge direction corresponding to each pixel included in the building, the offset angle between the roof and the base, etc., may be introduced.
- FIG. 11 is a method flowchart of an image processing model training method shown in the present disclosure.
- the method includes:
- S1102: acquiring a plurality of training samples involving buildings and including annotation information; wherein the annotation information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the above-mentioned building.
- the original image can be labeled with the labeling information by means of manual labeling or machine-assisted labeling.
- image annotation software can be used to mark each pixel included in the original image as belonging to the roof, the side area, or the background of the building; to mark which edge included in the outline of the building it belongs to; and to label its corresponding edge direction. On the other hand, the offset angle between the roof and the base of the building included in the image can be labelled.
- the training samples can be obtained after completing the above labeling operations for the original image.
- one-hot encoding or other methods may be used for encoding, and the present disclosure does not limit the specific encoding method.
- S1104 Construct joint learning loss information based on loss information corresponding to each sub-model included in the image processing model.
- the corresponding loss information of each sub-model may be determined first.
- the loss information corresponding to each of the above-mentioned sub-models is the cross-entropy loss information.
- joint learning loss information may be constructed based on the corresponding loss information of each sub-model included in the above image processing model. For example, the loss information corresponding to each sub-model can be added to obtain the above-mentioned joint learning loss information.
- a regularization term may also be added to the above-mentioned joint learning loss information, which is not particularly limited here.
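- the construction of the joint learning loss can be sketched as follows (a minimal illustration, assuming each branch outputs softmax probabilities and uses a cross-entropy loss, and a regularization term could be added to the sum; all names are hypothetical):

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    # Mean cross-entropy of one sub-model's softmax outputs
    # (probs: [num_samples, num_classes], labels: integer class indices).
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

def joint_learning_loss(branch_probs, branch_labels):
    # Joint learning loss: the sum of the loss information corresponding
    # to each sub-model included in the image processing model.
    return sum(cross_entropy(p, y) for p, y in zip(branch_probs, branch_labels))
```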
- S1106 may be executed to jointly train each sub-model included in the above-mentioned image processing model by using a plurality of the above-mentioned training samples based on the above-mentioned joint learning loss information, until the above-mentioned sub-models converge.
- the above-mentioned image processing model can be supervised based on the above-mentioned training samples marked with annotation information.
- the error between the annotation information and the above calculation results can be evaluated based on the constructed joint learning loss information.
- the stochastic gradient descent method can be used to determine the descending gradient.
- the model parameters corresponding to the above image processing model can be updated based on backpropagation. The above process is repeated until the above sub-models converge.
- the embodiments of the present disclosure do not specifically limit the conditions for model convergence.
- the four sub-models included in the image processing model can be trained simultaneously, so that the sub-models constrain and promote each other during the training process, improving the convergence efficiency of the image processing model on the one hand; on the other hand, the backbone network shared by each sub-model can predict features that are more beneficial to the prediction of the base area, thereby improving the accuracy of the base prediction.
- the present disclosure further provides an image processing apparatus.
- FIG. 12 is a schematic diagram of an image processing apparatus shown in the present disclosure.
- the above-mentioned apparatus 1200 includes: an acquisition module 1210, configured to acquire a target image including a building; an image processing module 1220, configured to perform image processing on the above-mentioned target image to determine the roof area of the above-mentioned building, the side bottom edge and the side top edge of the above-mentioned building, and the offset angle between the roof and the base of the above-mentioned building; and an offset determination module 1230, configured to determine the offset between the above-mentioned side bottom edge and the above-mentioned side top edge according to the above-mentioned offset angle.
- the transformation module 1240 is configured to transform the roof contour corresponding to the roof area according to the offset to obtain the base contour.
- the above-mentioned apparatus 1200 further includes: a building height determination module, configured to determine the height of the above-mentioned building based on the above-mentioned offset amount and a predetermined scale between the building height and the offset amount.
- the above-mentioned apparatus 1200 further includes: an edge direction determination module, configured to perform image processing on the above-mentioned target image and determine the edge direction corresponding to each pixel included in the roof outline of the above-mentioned building; and a regularization processing module, configured to perform regularization processing on the above-mentioned roof outline based on the above-mentioned edge direction to obtain the roof polygon corresponding to the above-mentioned building. The above-mentioned transformation module 1240 is specifically configured to: transform the above-mentioned roof polygon according to the above-mentioned offset to obtain the above-mentioned base outline.
- the above-mentioned regularization processing module includes: a first determination sub-module, configured to take any pixel point among the pixel points included in the above-mentioned roof outline as a target pixel point, and determine the direction difference between the edge direction corresponding to the above-mentioned target pixel point and the edge direction corresponding to the adjacent pixel points of the above-mentioned target pixel point; a second determination sub-module, configured to determine the above-mentioned target pixel point as a vertex of the roof polygon corresponding to the above-mentioned building when the direction difference between the edge direction corresponding to the above-mentioned target pixel point and the edge direction corresponding to the above-mentioned adjacent pixel points reaches a first preset threshold; and a roof polygon determination sub-module, configured to obtain the roof polygon corresponding to the above-mentioned building based on the determined vertices of the roof polygon.
- the above-mentioned apparatus 1200 further includes: a dividing module, configured to divide a preset angle to obtain N angle intervals and assign identification values to the above-mentioned N angle intervals, where N is a positive integer; the above-mentioned first determination sub-module is specifically configured to: determine the first angle interval to which the edge direction corresponding to the above-mentioned target pixel point belongs; determine the second angle interval to which the edge direction corresponding to the adjacent pixel point of the above-mentioned target pixel point belongs; and determine the difference between the identification value corresponding to the above-mentioned first angle interval and the identification value corresponding to the above-mentioned second angle interval as the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel point of the target pixel point.
- N is a positive integer less than or equal to the second preset threshold.
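One way to realize this interval scheme: divide the angle range into N equal intervals, use the interval index as the identification value, and compare indices. A sketch (the 180° range, N=8, and equal-width intervals are assumptions; the disclosure only requires N intervals with assigned identification values):

```python
def interval_id(angle_deg, n=8, full_range=180.0):
    # Identification value = index of the angle interval the direction falls in.
    width = full_range / n
    return int(angle_deg % full_range // width)

def direction_difference(dir_a, dir_b, n=8, full_range=180.0):
    # Direction difference = difference between the two identification values.
    return abs(interval_id(dir_a, n, full_range) - interval_id(dir_b, n, full_range))
```

Binning the directions makes the comparison robust to small per-pixel noise in the predicted edge direction, and a small N (bounded by the second preset threshold) keeps the comparison coarse enough that only genuine corners register a difference.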
- the above-mentioned apparatus 1200 further includes: a vertex correction module, configured to correct the vertices of the above-mentioned roof polygon based on a vertex correction model to obtain a corrected roof polygon; wherein the above-mentioned vertex correction model is a model determined based on a graph neural network.
- the offset determination module 1230 includes: an offset determination sub-module, configured to determine the position change amount by which the above-mentioned side bottom edge moves in the direction of the above-mentioned offset angle to the above-mentioned side top edge, and take the above-mentioned position change amount as the above-mentioned offset.
- the above-mentioned offset determination sub-module is specifically configured to: crop, based on a preset frame corresponding to the above-mentioned side top edge, the side top edge probability map corresponding to the above-mentioned side top edge to obtain a first cropping result, where the above-mentioned side top edge probability map includes the image area of the above-mentioned target image that contains the above-mentioned side top edge; move the above-mentioned side bottom edge multiple times in the direction of the above-mentioned offset angle according to a preset step size and a preset maximum offset, and after each movement, crop the side bottom edge probability map corresponding to the side bottom edge based on the preset frame to obtain a plurality of second cropping results, where the side bottom edge probability map includes the image area of the target image that contains the side bottom edge; and determine, among the above-mentioned plurality of second cropping results, a target cropping result that matches the above-mentioned first cropping result, and determine the position change amount of the side bottom edge when the target cropping result is obtained as the above-mentioned offset.
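The search above can be sketched as sliding the bottom-edge probability map along the offset-angle direction and scoring each shifted crop against the fixed top-edge crop. In this sketch the function names are illustrative and the matching score is a simple element-wise dot product; the disclosure does not specify the matching criterion:

```python
import math
import numpy as np

def estimate_offset(top_prob, bottom_prob, frame, step, max_offset, offset_angle_deg):
    x0, y0, x1, y1 = frame                     # preset frame (crop window)
    ref = top_prob[y0:y1, x0:x1]               # first cropping result
    ux = math.cos(math.radians(offset_angle_deg))
    uy = math.sin(math.radians(offset_angle_deg))
    best_shift, best_score = 0.0, float("-inf")
    k = 0
    while k * step <= max_offset:
        d = k * step
        dx, dy = int(round(d * ux)), int(round(d * uy))
        # Move the bottom-edge map by d along the offset angle, then crop it
        # with the same preset frame to get one second cropping result.
        moved = np.roll(np.roll(bottom_prob, -dy, axis=0), -dx, axis=1)
        crop = moved[y0:y1, x0:x1]
        score = float((ref * crop).sum())      # similarity to the first crop
        if score > best_score:
            best_score, best_shift = score, d
        k += 1
    return best_shift
```

The step size trades off precision against the number of crops evaluated, and the maximum offset bounds the search to plausible building heights.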
- the above-mentioned preset frame is determined by either of the following: determining the circumscribed frame corresponding to the above-mentioned side top edge as the above-mentioned preset frame; or determining the circumscribed frame corresponding to a combined edge obtained by combining a plurality of side top edges included in the above-mentioned roof outline as the above-mentioned preset frame.
- the above-mentioned image processing module 1220 is specifically configured to: perform image processing on the above-mentioned target image by using an image processing model to determine the roof area of the above-mentioned building, the side bottom edge and the side top edge of the above-mentioned building, and the offset angle between the roof and the base of the above-mentioned building;
- the above-mentioned image processing model includes a roof area prediction sub-model for outputting the above-mentioned roof area, a building edge prediction sub-model for outputting the above-mentioned side bottom edge and the above-mentioned side top edge, a building edge direction prediction sub-model for outputting the above-mentioned edge directions, and an offset angle prediction sub-model for outputting the above-mentioned offset angle.
- the training apparatus 1300 corresponding to the above-mentioned training method of the image processing model includes: a training sample acquisition module 1310, configured to acquire a plurality of training samples that relate to buildings and include labeling information, where the labeling information includes the roof area and side area of the above-mentioned building, each edge included in the building outline, the edge direction corresponding to each pixel point included in the above-mentioned building, and the offset angle between the roof and the base of the above-mentioned building; a loss information determination module 1320, configured to construct joint learning loss information based on the loss information corresponding to each of the above-mentioned sub-models included in the above-mentioned image processing model; and a joint training module 1330, configured to jointly train, based on the above-mentioned joint learning loss information, each of the above-mentioned sub-models included in the above-mentioned image processing model by using the plurality of above-mentioned training samples until each of the above-mentioned sub-models converges.
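The joint learning loss can be as simple as a weighted sum of the four sub-model losses. The weighted-sum form and the uniform default weights below are assumptions, since the disclosure only states that joint loss information is constructed from the sub-models' respective losses:

```python
def joint_learning_loss(sub_model_losses, weights=None):
    # sub_model_losses: per-sub-model loss values, e.g.
    # [roof_area_loss, building_edge_loss, edge_direction_loss, offset_angle_loss]
    if weights is None:
        weights = [1.0] * len(sub_model_losses)
    # Joint loss = weighted sum; minimizing it updates all sub-models together.
    return sum(w * l for w, l in zip(weights, sub_model_losses))
```

Training all sub-models against one scalar objective lets a shared backbone learn features useful for every prediction head at once, rather than training each head in isolation.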
- an electronic device, which may include: a processor; and a memory for storing processor-executable instructions; wherein the above-mentioned processor is configured to invoke the executable instructions stored in the above-mentioned memory to implement the image processing method shown in any of the above-mentioned embodiments.
- FIG. 13 is a hardware structure diagram of an electronic device shown in the present disclosure.
- the electronic device may include a processor for executing instructions, a network interface for making network connections, a memory for storing operating data for the processor, and a non-volatile memory for storing instructions corresponding to the image processing apparatus.
- the embodiment of the image processing apparatus may be implemented by software, or may be implemented by hardware or a combination of software and hardware.
- a device in a logical sense is formed by the processor of the electronic device where the device is located reading the corresponding computer program instructions from the non-volatile memory into the memory for execution.
- the electronic device where the apparatus is located in the embodiment may also include other hardware, which will not be described in detail here.
- the instructions corresponding to the image processing apparatus may also be stored directly in the memory, which is not limited herein.
- the present disclosure provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to implement the image processing method shown in any of the foregoing embodiments.
- one or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
- Embodiments of the subject matter and functional operations described in this disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware, including the structures disclosed in this disclosure and their structural equivalents, or in a combination of one or more of them.
- Embodiments of the subject matter described in this disclosure may be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus.
- the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information and transmit it to a suitable receiver apparatus for execution by a data processing apparatus.
- the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
- the processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
- the processes and logic flows described above can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, eg, an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
- a computer suitable for the execution of a computer program may include, for example, a general and/or special purpose microprocessor, or any other type of central processing unit.
- the central processing unit will receive instructions and data from read only memory and/or random access memory.
- the basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
- a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, to receive data from them, transmit data to them, or both; however, a computer does not have to have such devices.
- the computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
- Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
- the processor and memory may be supplemented by or incorporated in special purpose logic circuitry.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Geometry (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Description
Claims (15)
- An image processing method, characterized in that the method comprises: acquiring a target image containing a building; performing image processing on the target image to determine a roof area of the building, a side bottom edge and a side top edge of the building, and an offset angle between a roof and a base of the building; determining, according to the offset angle, an offset between the side bottom edge and the side top edge; and transforming, according to the offset, a roof outline corresponding to the roof area to obtain a base outline.
- The method according to claim 1, characterized in that the method further comprises: performing image processing on the target image to determine the edge direction corresponding to each pixel point included in the roof outline of the building; and performing regularization processing on the roof outline based on the edge directions to obtain a roof polygon corresponding to the building; wherein the transforming, according to the offset, the roof outline corresponding to the roof area to obtain the base outline comprises: transforming the roof polygon according to the offset to obtain the base outline.
- The method according to claim 2, characterized in that the performing regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building comprises: taking any pixel point among the pixel points included in the roof outline as a target pixel point, and determining a direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to an adjacent pixel point of the target pixel point; in a case where the direction difference reaches a first preset threshold, determining the target pixel point as a vertex of the roof polygon corresponding to the building; and obtaining the roof polygon corresponding to the building based on the determined vertices of the roof polygon.
- The method according to claim 3, characterized in that the method further comprises: dividing a preset angle to obtain N angle intervals, and assigning identification values to the N angle intervals, where N is a positive integer; wherein the determining the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel point of the target pixel point comprises: determining a first angle interval to which the edge direction corresponding to the target pixel point belongs; determining a second angle interval to which the edge direction corresponding to the adjacent pixel point of the target pixel point belongs; and determining the difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval as the direction difference between the edge direction corresponding to the target pixel point and the edge direction corresponding to the adjacent pixel point of the target pixel point.
- The method according to claim 4, characterized in that N is a positive integer less than or equal to a second preset threshold.
- The method according to any one of claims 3 to 5, characterized in that the method further comprises: correcting the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon; wherein the vertex correction model is a model determined based on a graph neural network.
- The method according to any one of claims 1 to 6, characterized in that the determining, according to the offset angle, the offset between the side bottom edge and the side top edge comprises: determining the position change amount by which the side bottom edge moves in the direction of the offset angle to the side top edge as the offset.
- The method according to claim 7, characterized in that the determining the position change amount by which the side bottom edge moves in the direction of the offset angle to the side top edge as the offset comprises: cropping, based on a preset frame corresponding to the side top edge, a side top edge probability map corresponding to the side top edge to obtain a first cropping result, the side top edge probability map including the image area of the target image that contains the side top edge; moving the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement, cropping, based on the preset frame, a side bottom edge probability map corresponding to the side bottom edge to obtain a plurality of second cropping results, the side bottom edge probability map including the image area of the target image that contains the side bottom edge; and determining, among the plurality of second cropping results, a target cropping result that matches the first cropping result, and determining the position change amount of the side bottom edge when the target cropping result is obtained as the offset.
- The method according to claim 8, characterized in that the preset frame is determined by either of the following: determining the circumscribed frame corresponding to the side top edge as the preset frame; or determining the circumscribed frame corresponding to a combined edge obtained by combining a plurality of side top edges included in the roof outline as the preset frame.
- The method according to any one of claims 2 to 9, characterized in that the performing image processing on the target image to determine the roof area of the building, the side bottom edge and the side top edge of the building, and the offset angle between the roof and the base of the building comprises: performing image processing on the target image by using an image processing model to determine the roof area of the building, the side bottom edge and the side top edge of the building, and the offset angle between the roof and the base of the building; wherein the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge directions, and an offset angle prediction sub-model for outputting the offset angle.
- The method according to claim 10, characterized in that the image processing model is obtained through the following training: acquiring a plurality of training samples that relate to buildings and include labeling information, the labeling information including the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel point included in the building, and the offset angle between the roof and the base of the building; constructing joint learning loss information based on the loss information corresponding to each of the sub-models included in the image processing model; and jointly training, based on the joint learning loss information, each of the sub-models included in the image processing model by using the plurality of training samples until each of the sub-models converges.
- The method according to any one of claims 1 to 11, characterized in that the method further comprises: determining the building height of the building based on the offset and a predetermined scale between building height and offset.
- An image processing apparatus, characterized in that the apparatus comprises: an acquisition module for acquiring a target image containing a building; an image processing module for performing image processing on the target image to determine a roof area of the building, a side bottom edge and a side top edge of the building, and an offset angle between a roof and a base of the building; an offset determination module for determining, according to the offset angle, an offset between the side bottom edge and the side top edge; and a transformation module for transforming, according to the offset, a roof outline corresponding to the roof area to obtain a base outline.
- An electronic device, characterized in that the device comprises a processor and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the executable instructions stored in the memory to implement the image processing method according to any one of claims 1 to 12.
- A computer-readable storage medium, characterized in that the storage medium stores a computer program, and the computer program is used to execute the image processing method according to any one of claims 1 to 12.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011036378.9 | 2020-09-27 | ||
CN202011036378.9A CN112037220A (zh) | 2020-09-27 | 2020-09-27 | An image processing method, apparatus, device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022062854A1 true WO2022062854A1 (zh) | 2022-03-31 |
Family
ID=73574957
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/115515 WO2022062854A1 (zh) | 2020-09-27 | 2021-08-31 | 一种图像处理方法、装置、设备和存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112037220A (zh) |
WO (1) | WO2022062854A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113256790A (zh) * | 2021-05-21 | 2021-08-13 | Zhuhai Kingsoft Online Game Technology Co., Ltd. | Modeling method and apparatus |
CN116342591A (zh) * | 2023-05-25 | 2023-06-27 | Xingrun Construction Group Co., Ltd. | A visual analysis method for building parameters |
CN117455815A (zh) * | 2023-10-18 | 2024-01-26 | Twenty-First Century Aerospace Technology Co., Ltd. | Method for correcting roof-to-base offset of flat-roofed buildings based on satellite imagery, and related device |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112037220A (zh) | 2020-09-27 | 2020-12-04 | Shanghai SenseTime Intelligent Technology Co., Ltd. | An image processing method, apparatus, device, and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1526108A (zh) * | 2001-02-14 | 2004-09-01 | Wireless Valley Communications, Inc. | Method and system for modeling and managing terrain, buildings, and infrastructure |
CN104240247A (zh) * | 2014-09-10 | 2014-12-24 | Wuxi Ruan Technology Co., Ltd. | A fast method for extracting the top-view outline of a building from a single image |
CN112037220A (zh) * | 2020-09-27 | 2020-12-04 | Shanghai SenseTime Intelligent Technology Co., Ltd. | An image processing method, apparatus, device, and storage medium |
-
2020
- 2020-09-27 CN CN202011036378.9A patent/CN112037220A/zh active Pending
-
2021
- 2021-08-31 WO PCT/CN2021/115515 patent/WO2022062854A1/zh active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1526108A (zh) * | 2001-02-14 | 2004-09-01 | Wireless Valley Communications, Inc. | Method and system for modeling and managing terrain, buildings, and infrastructure |
CN104240247A (zh) * | 2014-09-10 | 2014-12-24 | Wuxi Ruan Technology Co., Ltd. | A fast method for extracting the top-view outline of a building from a single image |
CN112037220A (zh) * | 2020-09-27 | 2020-12-04 | Shanghai SenseTime Intelligent Technology Co., Ltd. | An image processing method, apparatus, device, and storage medium |
Non-Patent Citations (2)
Title |
---|
KEQI ZHANG, JIANHUA YAN, AND SHU-CHING CHEN: "Automatic Construction of Building Footprints From Airborne LIDAR Data", IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 1 October 2006 (2006-10-01), pages 1 - 11, XP055914558, [retrieved on 20220421] * |
XU, YONGZHI: "Extracting Building Footprints from Digital Measurable Images", CHINESE MASTER'S THESES FULL-TEXT DATABASE, ENGINEERING SCIENCE II, 15 February 2014 (2014-02-15), XP055914553, [retrieved on 20220421] * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113256790A (zh) * | 2021-05-21 | 2021-08-13 | Zhuhai Kingsoft Online Game Technology Co., Ltd. | Modeling method and apparatus |
CN113256790B (zh) * | 2021-05-21 | 2024-06-07 | Zhuhai Kingsoft Digital Network Technology Co., Ltd. | Modeling method and apparatus |
CN116342591A (zh) * | 2023-05-25 | 2023-06-27 | Xingrun Construction Group Co., Ltd. | A visual analysis method for building parameters |
CN117455815A (zh) * | 2023-10-18 | 2024-01-26 | Twenty-First Century Aerospace Technology Co., Ltd. | Method for correcting roof-to-base offset of flat-roofed buildings based on satellite imagery, and related device |
Also Published As
Publication number | Publication date |
---|---|
CN112037220A (zh) | 2020-12-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108717710B (zh) | Positioning method, apparatus and system in an indoor environment | |
WO2022062854A1 (zh) | An image processing method, apparatus, device, and storage medium | |
WO2022062543A1 (zh) | An image processing method, apparatus, device, and storage medium | |
US10953545B2 (en) | System and method for autonomous navigation using visual sparse map | |
RU2713611C2 (ru) | Method of modeling three-dimensional space | |
CN113256712B (zh) | Positioning method, apparatus, electronic device and storage medium | |
US11941831B2 (en) | Depth estimation | |
JP2019087229A (ja) | Information processing apparatus, control method for information processing apparatus, and program | |
CN110276768B (zh) | Image segmentation method, image segmentation apparatus, image segmentation device and medium | |
US20190301871A1 (en) | Direct Sparse Visual-Inertial Odometry Using Dynamic Marginalization | |
CN110111364B (zh) | Motion detection method and apparatus, electronic device and storage medium | |
US11790661B2 (en) | Image prediction system | |
US20220164603A1 (en) | Data processing method, data processing apparatus, electronic device and storage medium | |
CN114641800A (zh) | Method and system for forecasting crowd dynamics | |
CN115421158A (zh) | Self-supervised learning-based solid-state lidar three-dimensional semantic mapping method and apparatus | |
CN113658203A (zh) | Method and apparatus for extracting three-dimensional building outlines and training a neural network | |
US20220164595A1 (en) | Method, electronic device and storage medium for vehicle localization | |
CN117132649A (zh) | Ship video positioning method and apparatus integrating artificial intelligence with BeiDou satellite navigation | |
Ribacki et al. | Vision-based global localization using ceiling space density | |
CN113916223B (zh) | Positioning method and apparatus, device, and storage medium | |
US11915449B2 (en) | Method and apparatus for estimating user pose using three-dimensional virtual space model | |
KR20230029981A (ko) | 포즈 결정을 위한 시스템 및 방법 | |
US10447992B1 (en) | Image processing method and system | |
Abdelaal et al. | Gramap: Qos-aware indoor mapping through crowd-sensing point clouds with grammar support | |
Kanai et al. | Improvement of 3D Monte Carlo localization using a depth camera and terrestrial laser scanner |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21871226 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022545960 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.09.2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21871226 Country of ref document: EP Kind code of ref document: A1 |