WO2022062854A1 - Image processing method, apparatus, device and storage medium (一种图像处理方法、装置、设备和存储介质) - Google Patents

Image processing method, apparatus, device and storage medium (一种图像处理方法、装置、设备和存储介质)

Info

Publication number
WO2022062854A1
Authority
WO
WIPO (PCT)
Prior art keywords
building
roof
edge
offset
Prior art date
Application number
PCT/CN2021/115515
Other languages
English (en)
French (fr)
Inventor
李唯嘉
孟令宣
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2022062854A1

Classifications

    All classifications fall under G — Physics; G06 — Computing, calculating or counting; G06T — Image data processing or generation, in general:
    • G06T 7/0004 Image analysis; inspection of images, e.g. flaw detection; industrial image inspection
    • G06T 7/13 Segmentation; edge detection
    • G06T 7/136 Segmentation; edge detection involving thresholding
    • G06T 7/564 Depth or shape recovery from multiple images, from contours
    • G06T 7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T 2207/10032 Image acquisition modality: satellite or aerial image; remote sensing
    • G06T 2207/10044 Image acquisition modality: radar image
    • G06T 2207/20084 Special algorithmic details: artificial neural networks [ANN]
    • G06T 2207/20088 Special algorithmic details: trinocular vision calculations; trifocal tensor
    • G06T 2207/30132 Subject of image: masonry; concrete

Definitions

  • the present disclosure relates to the field of computer technologies, and in particular, to an image processing method, apparatus, device, and storage medium.
  • the base of the building may be partially occluded in the image, resulting in inconspicuous visual features, which affects the prediction accuracy of the building base.
  • the present disclosure discloses at least one image processing method, the method including: acquiring a target image including a building; performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
  • the method further includes: determining the height of the building based on the offset and a predetermined scale between the height of the building and the offset.
  • the method further includes: performing image processing on the target image to determine the edge direction corresponding to each pixel included in the roof outline of the building; and performing regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building. In this case, transforming the roof contour corresponding to the roof area according to the offset to obtain the base contour includes: transforming the roof polygon according to the offset to obtain the base contour.
  • performing the regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building includes: taking any pixel among the pixels included in the roof outline as a target pixel, and determining the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to the pixels adjacent to the target pixel; when the direction difference reaches a first preset threshold, determining the target pixel as a vertex of the roof polygon corresponding to the building; and obtaining the roof polygon corresponding to the building based on the determined vertices.
  • the method further includes: dividing a preset angle to obtain N angle intervals, and assigning identification values to the N angle intervals, where N is a positive integer. Determining the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to the pixels adjacent to the target pixel then includes: determining the first angle interval to which the edge direction corresponding to the target pixel belongs; determining the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determining the difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval as the direction difference.
  • N is a positive integer less than or equal to the second preset threshold.
  • the method further includes: correcting the vertices of the roof polygon based on the vertex correction model to obtain a corrected roof polygon; wherein the vertex correction model is a model determined based on a graph neural network.
  • determining the offset between the side bottom edge and the side top edge according to the offset angle includes: determining the position change of the side bottom edge as it moves to the side top edge along the direction of the offset angle, and taking the position change as the offset.
  • determining the position change of the side bottom edge as it moves to the side top edge along the direction of the offset angle, and taking the position change as the offset, includes: cropping, based on a preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result, where the side top edge probability map includes the image area of the target image that contains the side top edge; moving the side bottom edge multiple times along the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement, cropping the side bottom edge probability map corresponding to the side bottom edge based on the preset frame to obtain a plurality of second cropping results, where the side bottom edge probability map includes the image area of the target image that contains the side bottom edge; and determining, among the plurality of second cropping results, the target cropping result that matches the first cropping result, and determining the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
  • the method further includes: determining the circumscribed frame corresponding to the side top edge as the preset frame; or determining the circumscribed frame corresponding to a combined edge, obtained by combining a plurality of side top edges included in the roof outline, as the preset frame.
  • performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building includes: using an image processing model to perform image processing on the target image to determine the roof area, the side bottom edge and side top edge, and the offset angle;
  • the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
  • the training method of the image processing model includes: acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and jointly training, based on the joint learning loss information, each sub-model included in the image processing model using the plurality of training samples until each sub-model converges.
  • the present disclosure also provides an image processing apparatus, the apparatus including: an acquisition module for acquiring a target image including a building; an image processing module for performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; an offset determination module for determining the offset between the side bottom edge and the side top edge according to the offset angle; and a transformation module for transforming the roof contour corresponding to the roof area according to the offset to obtain the base contour.
  • the apparatus further includes: a building height determination module, configured to determine the height of the building based on the offset and a predetermined scale between building height and offset.
  • the apparatus further includes: an edge direction determination module, configured to perform image processing on the target image and determine the edge direction corresponding to each pixel included in the roof outline of the building; and a regularization processing module, which performs regularization processing on the roof outline based on the edge directions to obtain the roof polygon corresponding to the building. The transformation module is specifically configured to transform the roof polygon according to the offset to obtain the base contour.
  • the regularization processing module includes: a first determination sub-module, configured to take any pixel among the pixels included in the roof outline as a target pixel and determine the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to its adjacent pixels; a second determination sub-module, configured to determine the target pixel as a vertex of the roof polygon corresponding to the building when the direction difference reaches the first preset threshold; and a roof polygon determination sub-module, which obtains the roof polygon corresponding to the building based on the determined vertices.
  • the apparatus further includes: a dividing module, which divides the preset angle to obtain N angle intervals and assigns identification values to the N angle intervals, where N is a positive integer. The first determination sub-module is specifically configured to: determine the first angle interval to which the edge direction corresponding to the target pixel belongs; determine the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determine the difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval as the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to the adjacent pixel.
  • N is a positive integer less than or equal to the second preset threshold.
  • the apparatus further includes: a vertex correction module, which corrects the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
  • the offset determination module includes an offset determination sub-module, configured to determine the position change of the side bottom edge as it moves to the side top edge along the direction of the offset angle, and to take the position change as the offset.
  • the offset determination sub-module is specifically configured to: crop, based on the preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result, where the side top edge probability map includes the image area of the target image that contains the side top edge; move the side bottom edge multiple times along the direction of the offset angle according to the preset step size and the preset maximum offset, and after each movement, crop the side bottom edge probability map corresponding to the side bottom edge based on the preset frame to obtain a plurality of second cropping results, where the side bottom edge probability map includes the image area of the target image that contains the side bottom edge; and determine, among the plurality of second cropping results, the target cropping result that matches the first cropping result, and determine the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
  • the apparatus is further configured to determine the circumscribed frame corresponding to the side top edge as the preset frame; or to determine the circumscribed frame corresponding to a combined edge, obtained by combining a plurality of side top edges included in the roof outline, as the preset frame.
  • the image processing module is specifically configured to use an image processing model to perform image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
  • the training device corresponding to the training method of the image processing model includes: a training sample acquisition module for acquiring a plurality of training samples that involve buildings and include label information, where the label information includes the roof area and side area of the building, the edges included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; a loss information determination module for constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and a joint training module for jointly training, based on the joint learning loss information, each sub-model using the plurality of training samples until each sub-model converges.
  • the present disclosure also proposes an electronic device, the device including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to call the executable instructions stored in the memory to implement the image processing method shown in any of the above embodiments.
  • the present disclosure also provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute the image processing method shown in any of the foregoing embodiments.
  • the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building are determined from the acquired target image. Then, according to the offset angle, the offset between the side bottom edge and the side top edge is determined. Finally, according to the offset, the roof contour corresponding to the roof area is transformed to obtain the base contour. In this way, the building base prediction does not need to rely on the base features included in the target image, so that a building base with higher precision can be obtained even when the building base features in the target image are occluded.
  • FIG. 1 is a method flowchart of an image processing method shown in the present disclosure.
  • FIG. 2 is a schematic flowchart of a base profile prediction shown in the present disclosure.
  • FIG. 3 is a schematic flowchart of image processing on a target image shown in the present disclosure.
  • FIG. 4 is a schematic diagram of an offset angle shown in the present disclosure.
  • FIG. 5 is a schematic diagram of an offset between a roof and a base shown in the present disclosure.
  • FIG. 6 is a method flowchart of a method for determining an offset shown in the present disclosure.
  • FIG. 7 is a schematic diagram of a movement process of a side bottom edge shown in the present disclosure.
  • FIG. 8 is a schematic flowchart of a base profile prediction shown in the present disclosure.
  • FIG. 9 is a schematic diagram of an edge direction shown in the present disclosure.
  • FIG. 10 is a schematic flowchart of image processing on a target image shown in the present disclosure.
  • FIG. 11 is a method flow chart of an image processing model training method shown in the present disclosure.
  • FIG. 12 is a schematic diagram of an image processing apparatus shown in the present disclosure.
  • FIG. 13 is a hardware structure diagram of an electronic device shown in the present disclosure.
  • the present disclosure aims to propose an image processing method.
  • the method predicts, from the target image, the roof area of the building and the relevant features for determining the offset between the roof and the base of the building, and determines the offset according to those relevant features. After the offset is determined, the roof contour corresponding to the roof area is transformed based on the offset to obtain the base contour. In this way, the building base prediction does not need to rely on the base features included in the target image, so that a building base with higher accuracy can be obtained even when the building base features in the target image are occluded.
  • FIG. 1 is a method flowchart of an image processing method shown in the present disclosure. As shown in FIG. 1, the above method may include:
  • S102 Acquire a target image including a building.
  • S104 Perform image processing on the target image to determine the roof area of the building, the side bottom edge and the side top edge of the building, and the offset angle between the roof and the base of the building.
  • S106 Determine the offset between the side bottom edge and the side top edge according to the offset angle.
  • S108 Transform the roof contour corresponding to the roof area according to the offset to obtain the base contour.
  • the above-mentioned image processing method can be applied to electronic equipment.
  • the above-mentioned electronic device may execute the above-mentioned image processing method by carrying a software system corresponding to the image processing method.
  • the type of the above-mentioned electronic device may be a notebook computer, a computer, a server, a mobile phone, a PAD terminal, etc., which is not particularly limited in the present disclosure.
  • the above-mentioned image processing method can be executed by the terminal device or the server device alone, or by the terminal device and the server device in cooperation.
  • the above-mentioned image processing method can be integrated in the client.
  • after receiving an image processing request, the terminal device equipped with the client can provide computing power through its own hardware environment to execute the above image processing method.
  • the above-mentioned image processing method can be integrated into the system platform.
  • the server device equipped with the system platform can provide computing power through its own hardware environment to execute the above image processing method.
  • the above image processing method can be divided into two tasks: acquiring a target image and processing the target image.
  • the acquisition task can be integrated in the client and carried on the terminal device.
  • Processing tasks can be integrated on the server and carried on the server device.
  • the above terminal device may initiate an image processing request to the above server device after acquiring the target image.
  • the server device may execute the method on the target image in response to the request.
  • in the following, the execution subject is taken to be an electronic device (hereinafter referred to as the device) as an example for description.
  • FIG. 2 is a schematic diagram of a base profile prediction process shown in the present disclosure.
  • the device may execute S104: perform image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
  • the above-mentioned target image refers to an image including at least one building in the image.
  • the above-mentioned target image may be a remote sensing image captured by a device such as an aircraft, an unmanned aerial vehicle, or a satellite.
  • the above-mentioned device may complete the acquisition of the target image by interacting with the user.
  • the above-mentioned device may provide the user with a window for inputting the target image to be processed through the interface mounted on the device, so that the user can input the image.
  • the user can complete the input of the target image based on this window.
  • the image can be input into the image processing model for calculation.
  • the above-mentioned device can directly acquire the remote sensing image output by the remote sensing image acquisition system.
  • the above-mentioned device may pre-establish a certain protocol with the remote sensing image acquisition system. After the remote sensing image acquisition system generates the remote sensing image, it can be sent to the above-mentioned equipment for image processing.
  • the device may perform image processing on the target image using an image processing model, so as to extract from the target image the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
  • the above-mentioned image processing model may specifically be a model that predicts, for the target image, the roof area of the building and the above-mentioned related features.
  • the image processing model may be a pre-trained neural network model.
  • FIG. 3 is a schematic flowchart of image processing on a target image shown in the present disclosure.
  • the above-mentioned image processing model may include three branches, and the three branches may share the same backbone network.
  • the first branch can be used to predict the roof area
  • the second branch can be used to predict the side edge (including the side top edge and the side bottom edge)
  • the third branch can be used to predict the offset angle between the roof and the base of the building.
  • the structure of the image processing model shown in FIG. 3 is only a schematic illustration, and in practical applications, the structure of the model can be built according to the actual situation.
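  • As an illustration only (the disclosure does not give a concrete network), the shared-backbone, multi-branch layout of FIG. 3 could be sketched in PyTorch roughly as follows; the choice of ResNet-18, the head shapes, and discretizing the offset angle into 36 bins are assumptions of this sketch, not the patent's actual model (FIG. 10 later adds a fourth, edge-direction branch in the same pattern):

      import torch
      import torch.nn as nn
      from torchvision.models import resnet18

      class BuildingNet(nn.Module):
          # Shared backbone with three prediction branches, as in FIG. 3.
          def __init__(self):
              super().__init__()
              trunk = resnet18(weights=None)
              self.backbone = nn.Sequential(*list(trunk.children())[:-2])  # 512-ch feature map

              def seg_head(out_ch):
                  # Pixel-level head: 1x1 conv, then upsample to input resolution.
                  return nn.Sequential(
                      nn.Conv2d(512, out_ch, kernel_size=1),
                      nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
                  )

              self.roof_head = seg_head(2)      # branch 1: roof vs. background
              self.edge_head = seg_head(3)      # branch 2: side top edge / side bottom edge / background
              self.angle_head = nn.Sequential(  # branch 3: one offset angle per image
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 36),  # 36 angle bins (illustrative)
              )

          def forward(self, x):
              f = self.backbone(x)
              return self.roof_head(f), self.edge_head(f), self.angle_head(f)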
  • the above image processing model may be obtained by training based on a plurality of training samples marked with annotation information.
  • the above-mentioned label information may include the roof area, the top edge of the side of the building and the bottom edge of the side of the building, and the offset angle between the roof and the base of the building.
  • the above-mentioned backbone network is specifically used for feature prediction for the target image.
  • the above-mentioned backbone network may be a feature prediction network such as VGG, ResNet, etc., which is not particularly limited here.
  • the above-mentioned branch 1 and branch 2 may specifically be a pixel-level segmentation network.
  • the above-mentioned branch 1 can classify each pixel included in the target image as belonging to either the roof or the background.
  • the target image includes pixel point A. If it is predicted that the pixel point A is a pixel point included in the roof area through the first branch, the pixel point A can be classified as belonging to the roof.
  • the above-mentioned branch 2 can divide each pixel included in the target image into one of the top edge of the side, the bottom edge of the side and the background.
  • the target image includes pixel point B. If it is predicted through the second branch above that the pixel point B is a pixel point included in the top edge of the side surface, the pixel point B can be classified as belonging to the top edge of the side surface.
  • the above-mentioned branch 1 may alternatively divide each pixel included in the target image into one of the roof, side, and background categories.
  • the above-mentioned branch 2 can classify each pixel included in the target image into one of the background, the edge between the roof and the background, the top edge of the side, the bottom edge of the side, and the hypotenuse edge of the left and right sides of the building side.
  • the above-mentioned branch 3 may specifically be an image-level segmentation network.
  • the above branch three can predict the offset angle between the base and the roof of the building included in the target image.
  • the above-mentioned offset angle may refer to the angular offset between the base and the roof.
  • the x-axis and the y-axis may belong to a rectangular coordinate system constructed with the lower left corner of the target image as the coordinate center.
  • the offset angle may be the angle between the side hypotenuse and the vertical direction (eg, may be the angle between the side hypotenuse and the vertical downward direction).
  • the tangent of the angle obtained by subtracting 90 degrees from the included angle equals the ratio of the change between the roof position and the base position along the y-axis to the change along the x-axis.
  • FIG. 4 is a schematic diagram of an offset angle shown in the present disclosure.
  • the coordinate system shown in FIG. 4 is a rectangular coordinate system constructed with the lower left corner of the target image as the origin.
  • the side hypotenuse is the edge along which the base connects to the roof.
  • the angle θ is the above-mentioned offset angle.
  • the offset angle of each building in the captured target image is approximately the same.
  • the above-mentioned offset angle may also be other angles defined by the developer according to the actual situation, which can indicate the included angle between the base and the roof of the building.
  • the above-mentioned offset angle may also be the included angle between the side hypotenuse and the vertical upward direction or the horizontal rightward direction, and so on.
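  • As a worked illustration of the relation above (the numbers are hypothetical): with θ measured from the vertical downward direction, the displacement of the roof relative to the base follows dy = tan(θ - 90°) · dx.

      import math

      def roof_shift(theta_deg, dx):
          # dy/dx = tan(theta - 90 deg), theta measured from the vertical
          # downward direction as in FIG. 4.
          dy = math.tan(math.radians(theta_deg - 90.0)) * dx
          return dx, dy

      # Hypothetical values: theta = 120 degrees and dx = 10 pixels
      # give dy = tan(30 deg) * 10 ≈ 5.77 pixels.
      print(roof_shift(120.0, 10.0))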
  • the above-mentioned side bottom edge is usually one of the edges included in the outline of the building base.
  • the above-mentioned side top edge is usually one of the edges included in the roof profile of the building.
  • S106 may be executed to determine the offset between the side bottom edge and the side top edge according to the offset angle.
  • the above-mentioned offset may specifically refer to the positional offset between the base and the roof.
  • the offset is used to transform the roof profile into the base profile.
  • the offset may be an offset vector, that is, the amounts by which the roof and the base differ along the x-axis and the y-axis.
  • FIG. 5 is a schematic diagram of an offset between a roof and a base shown in the present disclosure.
  • the coordinate system shown in FIG. 5 is a rectangular coordinate system constructed with the lower left corner of the target image as the origin.
  • Point P is a point on the base of the building.
  • Point Q is the point corresponding to point P on the roof of the building.
  • the difference (x2-x1, y2-y1) between the coordinates of point Q and the coordinates of point P is the above-mentioned offset.
  • when determining the offset between the side bottom edge and the side top edge according to the offset angle, the device may input the offset angle, the side bottom edge, and the side top edge into an offset determination unit for calculation, obtaining the offset between the side bottom edge and the side top edge.
  • the above offset determination unit is configured with an offset determination algorithm.
  • the algorithm can determine the position change amount of the bottom edge of the side surface moving to the top edge of the side surface in the direction of the offset angle, and use the position change amount as the offset amount.
  • the algorithm may also determine the position change amount of the top edge of the side surface moving to the bottom edge of the side surface in the direction of the offset angle, and use the position change amount as the offset amount.
  • the offset can be determined by moving the top edge of the side or the bottom edge of the side.
  • the principles of the two methods are the same, and the steps can be referred to each other.
  • the following description takes determining the offset by moving the side bottom edge as an example.
  • FIG. 6 is a method flowchart of an offset determination method shown in the present disclosure.
  • the above-mentioned device may first execute S602, and based on the preset frame corresponding to the above-mentioned side top edge, crop the side top edge probability map corresponding to the above-mentioned side top edge to obtain a first cropping result.
  • the above-mentioned preset frame may specifically be a preset frame including a side top edge.
  • the target image can be cropped through the frame to obtain the pixels inside the frame.
  • the circumscribed border corresponding to the above-mentioned side top edge may be determined as the above-mentioned preset border.
  • the top edge belonging to the same building obtained through S104 may be discontinuous due to model limitations, so that a preset frame determined from the side top edge alone cannot crop the top edge well, affecting the accuracy of offset determination. Therefore, in order to obtain an accurate preset frame, the preset frame can be determined from the roof profile and the side top edge together.
  • a circumscribed frame corresponding to the combined edge obtained by combining a plurality of side top edges included in the roof profile may be determined as the above-mentioned preset frame.
  • when the preset frame is determined based on a combined edge obtained by combining a plurality of side top edges included in the roof outline, the preset frame can include a complete top edge, so that the first cropping result can include a complete side top edge, improving the accuracy of offset determination.
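  • A minimal sketch of how such a preset frame could be computed, assuming the combined side top edges are given as a binary mask (the function name and mask representation are assumptions of this sketch):

      import numpy as np

      def preset_frame(top_edge_mask):
          # Circumscribed, axis-aligned frame of all side-top-edge pixels.
          # Taking the box over the combined edges tolerates gaps within a
          # single predicted top edge.
          ys, xs = np.nonzero(top_edge_mask)
          return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())  # x0, y0, x1, y1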
  • the probability map of the side top edge corresponding to the above-mentioned side top edge may be cropped by the preset frame to obtain a first cropping result.
  • the above-mentioned side top edge probability map may specifically be a top edge segmentation map obtained by performing image processing on the target image in S104.
  • the figure includes the side top edge included in the target image above.
  • the probability map is cropped according to the preset frame to obtain the first cropping result, which actually obtains an area including the top edge of the side of the building.
  • S604 may be executed: according to the preset step size and the preset maximum offset, the side bottom edge is moved multiple times along the direction of the offset angle, and after each movement, the side bottom edge probability map corresponding to the side bottom edge is cropped based on the preset frame to obtain a plurality of second cropping results.
  • the above-mentioned preset step size specifically refers to the distance the side bottom edge moves along the x-axis direction per step.
  • the preset step size can be set according to the actual situation. For example, if the resolution of the target image is larger, a larger preset step size can be set; otherwise, a smaller preset step size can be set.
  • the preset step size may also be a coordinate value of the bottom edge of the side surface moving along the y-axis direction, which is not particularly limited here. The following description is given by taking the preset step as an example of moving the bottom edge of the side face along the x-axis by m.
  • the above-mentioned preset maximum offset specifically refers to the maximum value of the movement of the bottom edge of the side surface along the x-axis direction.
  • the preset maximum offset may be set according to the actual situation. For example, if the resolution of the target image is larger, a larger preset maximum offset may be set; otherwise, a smaller preset maximum offset may be set.
  • the preset maximum offset may also be the maximum value of the movement of the bottom edge of the side surface along the y-axis direction, which is not particularly limited herein. The following description will be given by taking an example that the preset maximum offset is that the maximum value of the movement of the bottom edge of the side surface along the x-axis is n (where n is greater than m).
  • the above-mentioned S604 may be executed, and according to the preset step size and the preset maximum offset, the bottom edge of the side surface is moved multiple times in the direction of the offset angle, and every time After the second movement, the bottom edge probability map of the side surface corresponding to the bottom edge of the side surface is cropped based on the preset frame to obtain a plurality of second cropping results.
  • the above-mentioned side bottom edge probability map may specifically be a bottom edge segmentation map obtained by performing image processing on the target image in S104.
  • the figure includes the side bottom edge included in the target image above.
  • the second cropping result is obtained by cropping the probability map according to the preset frame, which actually obtains the area corresponding to the position of the bottom edge of the side of the building in the probability map of the side bottom edge.
  • FIG. 7 is a schematic diagram of a movement process of a side bottom edge shown in the present disclosure.
  • the coordinate system shown in FIG. 7 is a Cartesian coordinate system constructed with the lower left corner of the side bottom edge probability map as the origin.
  • DE is the initial position of the bottom edge of the side.
  • FG is an intermediate position of the bottom edge of the side during the movement.
  • HI is the last position of the bottom edge of the side when the movement ends (that is, the position corresponding to the preset maximum offset).
  • the dotted frame is the determined preset frame.
  • the side bottom edge starts from the initial position DE, and each time moves by m along the x-axis direction and by tan(θ-90)*m along the y-axis direction, until it reaches the position HI.
  • the region of the side bottom edge probability map corresponding to the coordinate position of the preset frame may be cropped according to the preset frame to obtain a second cropping result.
  • S606 may be executed: among the plurality of second cropping results, a target cropping result matching the first cropping result is determined, and the position change of the side bottom edge at the time the target cropping result was obtained is determined as the offset.
  • a distance measure such as the Euclidean distance or the Mahalanobis distance can be used to determine the similarity between the first cropping result and each second cropping result, and the highest similarity is found among the determined similarities. After the highest similarity is determined, the second cropping result corresponding to it is determined as the target cropping result. After the target cropping result is determined, the position change of the side bottom edge at the time the target cropping result was obtained may be determined as the offset.
  • the second cropping result obtained by cropping when the bottom edge of the side surface moves to the position FG is the above-mentioned target cropping result.
  • the combination of the changes in the x-axis and the y-axis when the bottom edge of the side surface moves from the position DE to the position FG can be determined as the above offset.
  • since the second cropping result cropped according to the preset frame is most similar to the first cropping result only when the side bottom edge has moved to the position of the side top edge, determining the offset as the position change of the side bottom edge when it moves to the side top edge along the direction of the offset angle yields an accurate offset.
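  • The search in S602 to S606 could be sketched as follows, assuming the two probability maps are numpy arrays of the same shape; the step size m, the maximum offset n, and the use of Euclidean distance (one of the measures the disclosure mentions) are illustrative choices:

      import math
      import numpy as np

      def find_offset(top_prob, bottom_prob, frame, theta_deg, m=1, n=40):
          # frame: (x0, y0, x1, y1) preset frame around the side top edge.
          x0, y0, x1, y1 = frame
          template = top_prob[y0:y1 + 1, x0:x1 + 1]         # first cropping result
          slope = math.tan(math.radians(theta_deg - 90.0))  # dy per unit dx
          best, best_dist = (0, 0), float("inf")
          for step in range(0, n + 1, m):                   # move the side bottom edge
              dx, dy = step, int(round(slope * step))
              # Moving the bottom edge by (dx, dy) and cropping with the fixed
              # frame equals cropping the original map at the shifted frame.
              if y0 - dy < 0 or x0 - dx < 0:
                  continue
              crop = bottom_prob[y0 - dy:y1 - dy + 1, x0 - dx:x1 - dx + 1]
              if crop.shape != template.shape:
                  continue
              dist = float(np.linalg.norm(template - crop))  # Euclidean distance
              if dist < best_dist:
                  best, best_dist = (dx, dy), dist
          return best                                       # offset (dx, dy)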
  • S108 may be continued, and according to the offset, the roof contour corresponding to the roof area is transformed to obtain the base contour.
  • the roof profile corresponding to the roof area can be determined first.
  • the outline enclosed by the outermost pixel points corresponding to the roof area is determined as the above-mentioned roof outline.
  • the roof contour corresponding to the roof area may be transformed according to the offset to obtain the base contour.
  • the coordinates of each pixel included in the roof outline can be translated along the x-axis and y-axis directions by the offset to obtain the coordinates of each pixel included in the base outline, thereby determining the base outline and completing the base prediction.
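  • Given the offset, the translation of S108 is a single vector subtraction; the sign convention in the sketch below follows FIG. 5, where the offset points from a base point P to the corresponding roof point Q:

      def roof_to_base(roof_contour, offset):
          # Translate every roof-outline point by -offset to obtain the
          # base outline (base = roof - (Q - P)).
          dx, dy = offset
          return [(x - dx, y - dy) for (x, y) in roof_contour]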
  • the roof area of the building, the side bottom edge and the side top edge of the building, and the offset angle between the roof and the base of the building are determined from the acquired target image. Then, according to the offset angle, the offset between the bottom edge of the side surface and the top edge of the side surface is determined. Finally, according to the above offset, the roof contour corresponding to the above roof area is transformed to obtain the base contour, so as to realize the translation transformation of the roof contour with obvious visual characteristics, obtain the building base contour, and complete the building base prediction.
  • in order to improve the prediction accuracy of the building base, after the roof contour is obtained, the roof contour may be regularized.
  • FIG. 8 is a schematic diagram of a base profile prediction process shown in the present disclosure.
  • the edge directions corresponding to each pixel included in the roof outline of the building can also be obtained.
  • the above-mentioned edge direction specifically refers to the normal vector direction of the edge.
  • the above edge direction is usually quantified by the edge direction angle.
  • the above-mentioned edge direction angle may be the angle between the normal vector and the vertical direction (for example, the angle between the normal vector and the vertical downward direction).
  • FIG. 9 is a schematic diagram of an edge direction shown in the present disclosure.
  • the direction corresponding to the normal vector LR of the edge JK is the edge direction of the aforementioned edge JK.
  • a vertical downward direction vector LS can be constructed, and the edge direction angle can be indicated by the included angle between the normal vector LR and the direction vector LS. It can be understood that the edge direction corresponding to the pixels included in a certain edge is generally consistent with the edge direction corresponding to that edge.
  • the above-mentioned device may divide the preset angle in advance to obtain N angle intervals; wherein, N is a positive integer.
  • the above-mentioned preset angle may be an empirical angle. For example, 360 degrees or 720 degrees, etc.
  • in order to reduce the number of edge direction types and expand the span of each angle interval, N may be a positive integer less than or equal to a second preset threshold.
  • the second preset threshold may be a suitable number of angle intervals determined by experience.
  • expanding the span of each angle interval and narrowing the value range of the edge direction types means that quantizing the edge direction does not rely too heavily on the prediction accuracy of the image processing model. This can improve the prediction accuracy of the edge direction, so as to further extract more accurate roof profiles and improve the prediction accuracy of building bases.
  • identification values may be assigned to the above N angle intervals, respectively.
  • the above-mentioned identification value corresponds to the angle interval one-to-one.
  • the number sequence of the angle intervals may be used as the above identification value. For example, when a certain angle interval is the third interval, 3 may be used as the identification value corresponding to the angle interval.
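  • A minimal sketch of this quantization, with an illustrative 360-degree preset angle split into 12 intervals of 30 degrees each (both values are assumptions, not fixed by the disclosure):

      def direction_id(angle_deg, preset_angle=360.0, n_bins=12):
          # Map an edge-direction angle to the identification value of the
          # angle interval it falls in; intervals are numbered from 1, so
          # the third interval has identification value 3.
          width = preset_angle / n_bins
          return int((angle_deg % preset_angle) // width) + 1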
  • the above-mentioned target image can be input into a pre-built edge direction prediction model to predict the edge direction.
  • FIG. 10 is a schematic flowchart of image processing on a target image according to the present disclosure.
  • the above-mentioned image processing model also includes a fourth branch.
  • the fourth branch can be used to predict the edge direction corresponding to each pixel point included in the roof outline.
  • the branch four shares the above-mentioned backbone network with the other three branches.
  • the above image processing model may be obtained by training based on a plurality of training samples marked with annotation information.
  • the above-mentioned label information further includes the edge direction corresponding to each pixel point.
  • each pixel of the original image can be marked with an identification value corresponding to the angle interval to which its edge direction belongs. For example, after the original image is acquired, each pixel of the original image can be traversed, and the identification value corresponding to the angle interval to which the edge direction of each pixel belongs is annotated by labeling software.
  • the above-mentioned roof outline can be regularized based on the above-mentioned edge direction to obtain the roof polygon corresponding to the above-mentioned building.
  • each pixel among the pixels included in the roof outline can in turn be taken as the target pixel, and the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to the pixels adjacent to the target pixel is determined.
  • when the direction difference reaches the first preset threshold, the target pixel is determined as a vertex of the roof polygon corresponding to the building.
  • the above-mentioned adjacent pixel point may refer to any pixel point in two pixel points adjacent to the target pixel point.
  • the above-mentioned first preset threshold may be a threshold set according to experience. When the direction difference between the edge directions corresponding to two adjacent pixels reaches the threshold, the two pixels do not belong to the same edge. Therefore, according to the above steps, among the pixels included in the roof outline, the pixels that lie on the same edge as their neighbours can be discarded while the vertex pixels are retained, achieving the purpose of regularizing the roof outline.
  • the value of the first preset threshold depends on how the edge direction is quantified. When the edge direction is quantified by the edge direction angle, the first preset threshold may be 30; when the edge direction is quantified by the identification value of the angle interval to which the edge direction angle belongs, the first preset threshold may be 3.
  • when determining the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to its adjacent pixels, the first edge direction angle corresponding to the target pixel and the second edge direction angle corresponding to the adjacent pixel can be determined; the difference between the first edge direction angle and the second edge direction angle may then be determined as the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to its adjacent pixels.
  • alternatively, the first angle interval to which the edge direction corresponding to the target pixel belongs and the second angle interval to which the edge direction corresponding to the adjacent pixel belongs may be determined; the difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval may then be determined as the direction difference between the edge direction corresponding to the target pixel and the edge directions corresponding to its adjacent pixels.
  • the roof polygon corresponding to the above-mentioned building can be obtained based on the determined vertices of the roof polygon.
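  • The vertex selection described above could be sketched as follows, assuming the roof outline is given as an ordered list of interval identification values and using the illustrative threshold of 3; for brevity the sketch ignores wrap-around between the first and last angle interval:

      def polygon_vertex_indices(contour_ids, first_threshold=3):
          # contour_ids: identification value of each contour pixel, in
          # order around the roof outline. A pixel whose interval id
          # differs from its predecessor's by at least the first preset
          # threshold is kept as a polygon vertex.
          vertices = []
          for i in range(len(contour_ids)):
              if abs(contour_ids[i] - contour_ids[i - 1]) >= first_threshold:
                  vertices.append(i)
          return vertices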
  • the vertices of the roof polygon may be corrected based on the vertex correction model to obtain the corrected roof polygon;
  • the vertex correction model is a model determined based on a graph neural network.
  • the above-mentioned vertex correction model may be a model independent of the image processing model used in the present disclosure, or may be a sub-model (sub-module) of the image processing model, which is not particularly limited here.
  • the above-mentioned vertex correction model can be regarded as a sub-model (sub-module) of the image processing model, and the vertex correction can be performed by using this sub-model (sub-module).
  • since this scheme uses a vertex correction model based on a graph neural network to further correct the roof polygon, a more accurate roof profile can be obtained and the prediction accuracy of the building base can be improved.
  • the above roof polygon can be transformed according to the above offset to obtain the outline of the base.
  • this scheme can predict a more accurate roof outline, thereby improving the prediction accuracy of the building base.
  • the building height may also be predicted based on a single-scene target image. Specifically, after the offset is determined, the building height may be determined based on the offset and a predetermined scale between building height and offset.
  • the real heights of some buildings may be acquired in advance, and the offsets corresponding to these buildings determined by using the offset determination method described in the present disclosure. Then, the scale between the building height and the offset is determined based on the above data.
  • the corresponding building height can be determined according to the predicted offset.
  • the corresponding building height may be obtained based on the above offset and a predetermined scale between the building height and the offset. Therefore, when predicting the height of a building, there is no need to rely on remote sensing images, lidar (LIDAR) data, digital surface model (DSM) and other data from multiple scenes and different perspectives, thereby reducing the cost and difficulty of building height prediction.
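  • A minimal sketch of this height estimation, assuming the scale is fitted by least squares from a few buildings of known height (the function names and the least-squares choice are assumptions of this sketch):

      import numpy as np

      def fit_scale(known_heights, known_offsets):
          # Least-squares scale s with height ≈ s * |offset|, fitted from
          # buildings whose true heights were measured in advance.
          h = np.asarray(known_heights, dtype=float)
          d = np.linalg.norm(np.asarray(known_offsets, dtype=float), axis=1)
          return float((d @ h) / (d @ d))

      def building_height(offset, scale):
          # Predicted height of a building from its roof-to-base offset.
          return scale * float(np.hypot(*offset))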
  • the image processing model used in the building base prediction scheme may include a roof area prediction sub-model for outputting the above-mentioned roof area, and a building edge prediction sub-model for outputting the above-mentioned side bottom edge and the above-mentioned side top edge, A building edge direction prediction sub-model for outputting the above-mentioned edge direction, and an offset angle prediction sub-model for outputting the above-mentioned offset angle.
  • the multi-task joint training method is adopted when training the image processing model.
  • constraints on the roof area and side area of the building, the edges included in the building outline, the edge direction corresponding to each pixel included in the building, the offset angle between the roof and the base, and so on, may be introduced.
  • FIG. 11 is a method flowchart of an image processing model training method shown in the present disclosure.
  • the method includes:
  • S1102 Acquire a plurality of training samples that involve buildings and include labeling information; wherein the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building.
  • the original image can be labeled with the labeling information by means of manual labeling or machine-assisted labeling.
  • image annotation software can be used to mark, for each pixel included in the original image, whether it belongs to the roof, the side area, or the background of the building; which edge included in the outline of the building it belongs to; and its corresponding edge direction. In addition, the offset angle between the roof and the base of the building included in the image can be labelled.
  • the training samples can be obtained after completing the above labeling operations for the original image.
  • When encoding the labeling information, one-hot encoding or other methods may be used; the present disclosure does not limit the specific encoding method.
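As a sketch of the label encoding just mentioned (one-hot is merely one option the disclosure allows), a per-pixel integer class map can be expanded as follows; the class count and names are assumptions for illustration.

```python
import numpy as np

def one_hot(label_map, num_classes):
    """Expand an (H, W) integer class map into an (H, W, C) one-hot tensor."""
    return np.eye(num_classes, dtype=np.float32)[label_map]

# e.g. 0 = background, 1 = roof, 2 = side area
labels = np.array([[0, 1], [2, 1]])
print(one_hot(labels, num_classes=3).shape)  # (2, 2, 3)
```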
  • S1104: constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model.
  • The loss information corresponding to each sub-model may be determined first.
  • To improve the prediction accuracy of the sub-models, the loss information corresponding to each of the sub-models is cross-entropy loss information.
  • The joint learning loss information may then be constructed from the loss information of each sub-model included in the image processing model; for example, the losses of the sub-models can be summed to obtain the joint learning loss information.
  • A regularization term may also be added to the joint learning loss information, which is not particularly limited here.
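A minimal sketch of the joint loss construction described above, assuming four per-task cross-entropy terms that are simply summed, with an optional L2 regularization term; the head names and the weighting are illustrative assumptions, not the disclosed formulation.

```python
import torch
import torch.nn.functional as F

def joint_loss(outputs, targets, model=None, weight_decay=0.0):
    """Sum the cross-entropy losses of the four sub-models (plus optional L2)."""
    total = (
        F.cross_entropy(outputs["roof"], targets["roof"])            # roof area
        + F.cross_entropy(outputs["edges"], targets["edges"])        # side top/bottom edges
        + F.cross_entropy(outputs["edge_dir"], targets["edge_dir"])  # per-pixel edge direction bin
        + F.cross_entropy(outputs["angle"], targets["angle"])        # offset angle (as a class)
    )
    if model is not None and weight_decay > 0:
        total = total + weight_decay * sum(p.pow(2).sum() for p in model.parameters())
    return total
```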
  • S1106 may then be executed: based on the joint learning loss information, jointly training each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
  • The image processing model can be trained in a supervised manner on the training samples marked with annotation information.
  • The error between the annotation information and the model's forward-pass results can be evaluated based on the constructed joint learning loss information.
  • The stochastic gradient descent method can be used to determine the descent gradient.
  • The model parameters of the image processing model can then be updated by backpropagation, and the process repeated until each sub-model converges.
  • The embodiments of the present disclosure do not specifically limit the conditions for model convergence.
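Putting these steps together, a hedged sketch of the supervised joint training loop (SGD plus backpropagation) might look as follows; `model`, `joint_loss`, and the data loader are assumptions carried over from the previous sketch.

```python
import torch

def train(model, loader, epochs=10, lr=1e-2):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for epoch in range(epochs):
        for images, targets in loader:
            outputs = model(images)              # forward pass through all four heads
            loss = joint_loss(outputs, targets)  # joint learning loss from above
            optimizer.zero_grad()
            loss.backward()                      # backpropagation
            optimizer.step()                     # SGD parameter update
```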
  • Because a supervised joint training method is used, the four sub-models included in the image processing model can be trained simultaneously, so that the sub-models both constrain and promote each other during training. On the one hand, this improves the convergence efficiency of the image processing model; on the other hand, the backbone network shared by the sub-models is encouraged to predict features that are more beneficial to base-area prediction, thereby improving the accuracy of base prediction.
  • the present disclosure further provides an image processing apparatus.
  • FIG. 12 is a schematic diagram of an image processing apparatus shown in the present disclosure.
  • The apparatus 1200 includes: an acquisition module 1210 for acquiring a target image containing a building; an image processing module 1220 for performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; an offset determination module 1230 for determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and a transformation module 1240 for transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
  • the above-mentioned apparatus 1200 further includes: a building height determination module, configured to determine the height of the above-mentioned building based on the above-mentioned offset amount and a predetermined scale between the building height and the offset amount.
  • the above-mentioned apparatus 1200 further includes: an edge direction determination module, configured to perform image processing on the above-mentioned target image, and determine the respective edge directions corresponding to each pixel included in the roof outline of the above-mentioned building; the above-mentioned apparatus 1200 It also includes: a regularization processing module, based on the above-mentioned edge direction, performs regular processing on the above-mentioned roof outline, and obtains the roof polygon corresponding to the above-mentioned building; the above-mentioned transformation module 1240 is specifically used for: according to the above-mentioned offset, the above-mentioned roof polygon The transformation is performed to obtain the above-mentioned base outline.
  • The regularization processing module includes: a first determination sub-module, configured to take any pixel among the pixels included in the roof outline as a target pixel and determine the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to a pixel adjacent to the target pixel; a second determination sub-module, configured to determine the target pixel as a vertex of the roof polygon corresponding to the building when that direction difference reaches a first preset threshold; and a roof polygon determination sub-module, configured to obtain the roof polygon corresponding to the building based on the determined vertices.
  • The apparatus 1200 further includes: a dividing module, configured to divide a preset angle into N angle intervals and assign identification values to the N angle intervals, where N is a positive integer. The first determination sub-module is specifically configured to: determine the first angle interval to which the edge direction corresponding to the target pixel belongs; determine the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determine the difference between the identification value of the first angle interval and the identification value of the second angle interval as the direction difference between the two edge directions.
  • N is a positive integer less than or equal to the second preset threshold.
  • The apparatus 1200 further includes: a vertex correction module, configured to correct the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
  • The offset determination module 1230 includes: an offset determination sub-module, configured to determine the position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge, and to take that position change as the offset.
  • The offset determination sub-module is specifically configured to: crop, based on the preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result, where the side top edge probability map covers the image region of the target image that contains the side top edge; move the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement crop, based on the preset frame, the side bottom edge probability map corresponding to the side bottom edge to obtain a plurality of second cropping results, where the side bottom edge probability map covers the image region of the target image that contains the side bottom edge; and, among the plurality of second cropping results, determine the target cropping result that matches the first cropping result and determine the position change of the side bottom edge when the target cropping result was obtained as the offset.
  • The apparatus 1200 is further configured to: determine the circumscribed frame corresponding to the side top edge as the preset frame; or determine, as the preset frame, the circumscribed frame corresponding to the combined edge obtained by combining a plurality of side top edges included in the roof outline.
  • The image processing module 1220 is specifically configured to: perform image processing on the target image with an image processing model to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building, where the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
  • The training apparatus 1300 corresponding to the training method of the image processing model includes: a training sample acquisition module 1310 for acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; a loss information determination module 1320 for constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and a joint training module 1330 for jointly training, based on the joint learning loss information, each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
  • Accordingly, the present disclosure discloses an electronic device, which may include: a processor; and
  • a memory for storing processor-executable instructions.
  • the above-mentioned processor is configured to invoke the executable instructions stored in the above-mentioned memory to implement the image processing method shown in any of the above-mentioned embodiments.
  • FIG. 13 is a hardware structure diagram of an electronic device shown in the present disclosure.
  • The electronic device may include a processor for executing instructions, a network interface for making network connections, a memory for storing operating data for the processor, and a non-volatile memory for storing instructions corresponding to the image processing apparatus.
  • The embodiments of the image processing apparatus may be implemented by software, or by hardware, or by a combination of software and hardware.
  • Taking a software implementation as an example, a device in the logical sense is formed by the processor of the electronic device where the apparatus is located reading the corresponding computer program instructions from the non-volatile memory into memory for execution.
  • In terms of hardware, besides the components shown, the electronic device where the apparatus is located may also include other hardware according to its actual functions, which is not described further here.
  • To increase processing speed, the instructions corresponding to the image processing apparatus may also be stored directly in the memory, which is not limited herein.
  • The present disclosure provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute the image processing method shown in any of the foregoing embodiments.
  • One or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
  • Embodiments of the subject matter and functional operations described in this disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that may include the structures disclosed in this disclosure and their structural equivalents, or in a combination of one or more of these.
  • Embodiments of the subject matter described in this disclosure may be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus.
  • Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode and transmit information to a suitable receiver apparatus for execution by a data processing apparatus.
  • The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
  • the processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • The processes and logic flows described above can also be performed by, and the apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
  • a computer suitable for the execution of a computer program may include, for example, a general and/or special purpose microprocessor, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from read only memory and/or random access memory.
  • the basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • Generally, a computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or will be operatively coupled to such devices to receive data from them, transfer data to them, or both.
  • However, a computer need not have such devices.
  • Moreover, a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory may be supplemented by or incorporated in special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides an image processing method, apparatus, device, and storage medium. The method includes: acquiring a target image containing a building; performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.

Description

Image processing method, apparatus, device, and storage medium
CROSS-REFERENCE TO RELATED APPLICATION
The present disclosure claims priority to Chinese Patent Application No. 202011036378.9, filed on September 27, 2020, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to the field of computer technology, and in particular to an image processing method, apparatus, device, and storage medium.
BACKGROUND
At present, in the field of image processing, the contours of buildings in an image often need to be predicted for activities such as urban planning, map drawing, building change detection, and residential area management. Among the more important tasks in building prediction are the prediction of the building base and the prediction of the building height.
However, since captured building images are usually non-orthographic remote sensing images taken by satellite or aircraft, the building base in the image may be partially occluded, so that its visual features are not obvious, which affects the prediction accuracy of the building base.
SUMMARY
In view of this, the present disclosure discloses at least an image processing method, including: acquiring a target image containing a building; performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
In some examples shown, the method further includes: determining the height of the building based on the offset and a predetermined scale between building height and offset.
In some examples shown, the method further includes: performing image processing on the target image to determine the edge direction corresponding to each pixel included in the roof contour of the building; and regularizing the roof contour based on the edge directions to obtain the roof polygon corresponding to the building. Transforming the roof contour corresponding to the roof area according to the offset to obtain the base contour includes: transforming the roof polygon according to the offset to obtain the base contour.
In some examples shown, regularizing the roof contour based on the edge directions to obtain the roof polygon corresponding to the building includes: taking any pixel among the pixels included in the roof contour as a target pixel, and determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to a pixel adjacent to the target pixel; when the direction difference reaches a first preset threshold, determining the target pixel as a vertex of the roof polygon corresponding to the building; and obtaining the roof polygon corresponding to the building based on the determined vertices.
In some examples shown, the method further includes: dividing a preset angle into N angle intervals and assigning identification values to the N angle intervals, where N is a positive integer. Determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to the adjacent pixel includes: determining the first angle interval to which the edge direction corresponding to the target pixel belongs; determining the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determining the difference between the identification value of the first angle interval and the identification value of the second angle interval as the direction difference.
In some examples shown, N is a positive integer less than or equal to a second preset threshold.
In some examples shown, the method further includes: correcting the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
In some examples shown, determining the offset between the side bottom edge and the side top edge according to the offset angle includes: determining the position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge, and taking that position change as the offset.
In some examples shown, determining that position change and taking it as the offset includes: cropping, based on a preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result, where the side top edge probability map covers the image region of the target image containing the side top edge; moving the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement cropping, based on the preset frame, the side bottom edge probability map corresponding to the side bottom edge to obtain a plurality of second cropping results, where the side bottom edge probability map covers the image region of the target image containing the side bottom edge; and, among the plurality of second cropping results, determining the target cropping result that matches the first cropping result, and determining the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
In some examples shown, the method further includes: determining the circumscribed frame corresponding to the side top edge as the preset frame; or determining, as the preset frame, the circumscribed frame corresponding to the combined edge obtained by combining a plurality of side top edges included in the roof contour.
In some examples shown, performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building includes: performing the image processing with an image processing model, where the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
In some examples shown, the training method of the image processing model includes: acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and jointly training, based on the joint learning loss information, each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
The present disclosure further provides an image processing apparatus, including: an acquisition module for acquiring a target image containing a building; an image processing module for performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; an offset determination module for determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and a transformation module for transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
In some examples shown, the apparatus further includes: a building height determination module for determining the height of the building based on the offset and a predetermined scale between building height and offset.
In some examples shown, the apparatus further includes: an edge direction determination module for performing image processing on the target image to determine the edge direction corresponding to each pixel included in the roof contour of the building; and a regularization processing module for regularizing the roof contour based on the edge directions to obtain the roof polygon corresponding to the building. The transformation module is specifically configured to transform the roof polygon according to the offset to obtain the base contour.
In some examples shown, the regularization processing module includes: a first determination sub-module for taking any pixel among the pixels included in the roof contour as a target pixel and determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to a pixel adjacent to the target pixel; a second determination sub-module for determining the target pixel as a vertex of the roof polygon corresponding to the building when the direction difference reaches a first preset threshold; and a roof polygon determination sub-module for obtaining the roof polygon corresponding to the building based on the determined vertices.
In some examples shown, the apparatus further includes a dividing module for dividing a preset angle into N angle intervals and assigning identification values to the N angle intervals, where N is a positive integer. The first determination sub-module is specifically configured to: determine the first angle interval to which the edge direction corresponding to the target pixel belongs; determine the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determine the difference between the identification values of the two intervals as the direction difference.
In some examples shown, N is a positive integer less than or equal to a second preset threshold.
In some examples shown, the apparatus further includes: a vertex correction module for correcting the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
In some examples shown, the offset determination module includes: an offset determination sub-module for determining the position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge, and taking that position change as the offset.
In some examples shown, the offset determination sub-module is specifically configured to: crop, based on a preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result, where the side top edge probability map covers the image region of the target image containing the side top edge; move the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement crop, based on the preset frame, the side bottom edge probability map corresponding to the side bottom edge to obtain a plurality of second cropping results, where the side bottom edge probability map covers the image region of the target image containing the side bottom edge; and, among the plurality of second cropping results, determine the target cropping result that matches the first cropping result and determine the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
In some examples shown, the apparatus further determines the preset frame by: determining the circumscribed frame corresponding to the side top edge as the preset frame; or determining, as the preset frame, the circumscribed frame corresponding to the combined edge obtained by combining a plurality of side top edges included in the roof contour.
In some examples shown, the image processing module is specifically configured to: perform image processing on the target image with an image processing model to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building, where the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
In some examples shown, the training apparatus corresponding to the training method of the image processing model includes: a training sample acquisition module for acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; a loss information determination module for constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and a joint training module for jointly training, based on the joint learning loss information, each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
The present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions, where the processor is configured to invoke the executable instructions stored in the memory to implement the image processing method shown in any of the above embodiments.
The present disclosure further provides a computer-readable storage medium storing a computer program, where the computer program is used to execute the image processing method shown in any of the above embodiments.
In the above scheme, the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building are determined from the acquired target image. The offset between the side bottom edge and the side top edge is then determined according to the offset angle. Finally, the roof contour corresponding to the roof area is transformed according to the offset to obtain the base contour. The base prediction process therefore does not need to rely on base features in the target image, so a building base of high accuracy can be obtained even when the base features of the building in the target image are occluded.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to explain the technical solutions in one or more embodiments of the present disclosure or the related art more clearly, the drawings needed in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in one or more embodiments of the present disclosure; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of an image processing method shown in the present disclosure;
FIG. 2 is a schematic diagram of a base contour prediction flow shown in the present disclosure;
FIG. 3 is a schematic diagram of a flow of performing image processing on a target image shown in the present disclosure;
FIG. 4 is a schematic diagram of an offset angle shown in the present disclosure;
FIG. 5 is a schematic diagram of a roof-to-base offset shown in the present disclosure;
FIG. 6 is a flowchart of an offset determination method shown in the present disclosure;
FIG. 7 is a schematic diagram of a side bottom edge moving process shown in the present disclosure;
FIG. 8 is a schematic diagram of a base contour prediction flow shown in the present disclosure;
FIG. 9 is a schematic diagram of an edge direction shown in the present disclosure;
FIG. 10 is a schematic diagram of a flow of performing image processing on a target image shown in the present disclosure;
FIG. 11 is a flowchart of an image processing model training method shown in the present disclosure;
FIG. 12 is a schematic diagram of an image processing apparatus shown in the present disclosure;
FIG. 13 is a hardware structure diagram of an electronic device shown in the present disclosure.
DETAILED DESCRIPTION
Exemplary embodiments are described in detail below, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The singular forms "a", "the above", and "the" used in the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items. It should also be understood that the word "if" as used herein may, depending on the context, be interpreted as "when", "while", or "in response to determining".
The present disclosure aims to propose an image processing method. The method predicts, from the target image, the roof area of the building and the related features used to determine the offset between the roof and the base of the building, and determines the offset according to those related features. After the offset is determined, the roof contour corresponding to the roof area is transformed based on the offset to obtain the base contour, so that the base prediction process does not need to rely on base features in the target image, and a building base of high accuracy can be obtained even when the base features of the building in the target image are occluded.
Please refer to FIG. 1, which is a flowchart of an image processing method shown in the present disclosure. As shown in FIG. 1, the method may include:
S102: acquiring a target image containing a building.
S104: performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
S106: determining, according to the offset angle, the offset between the side bottom edge and the side top edge.
S108: transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
The image processing method can be applied to an electronic device, which executes the method through a software system corresponding to the method. In the embodiments of the present disclosure, the electronic device may be a notebook computer, a computer, a server, a mobile phone, a PAD terminal, and so on, which is not particularly limited in the present disclosure.
It can be understood that the image processing method may be executed by a terminal device or a server device alone, or by a terminal device and a server device in cooperation.
For example, the image processing method may be integrated in a client. After receiving an image processing request, the terminal device carrying the client can provide computing power through its own hardware environment to execute the method.
As another example, the image processing method may be integrated in a system platform. After receiving an image processing request, the server device carrying the system platform can provide computing power through its own hardware environment to execute the method.
As yet another example, the image processing method may be divided into two tasks: acquiring the target image and processing the target image. The acquisition task can be integrated in a client and carried on a terminal device, while the processing task can be integrated in a server and carried on a server device. After acquiring the target image, the terminal device initiates an image processing request to the server device, which, in response, executes the method on the target image.
The following description takes an electronic device (hereinafter, the device) as the execution subject.
Please refer to FIG. 2, which is a schematic diagram of a base contour prediction flow shown in the present disclosure.
As shown in FIG. 2, after acquiring the target image, the device can execute S104: performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
The target image is an image containing at least one building; for example, it may be a remote sensing image captured by an aircraft, a drone, a satellite, or similar equipment.
In one case, the device can acquire the target image by interacting with a user. For example, the device can provide, through its interface, a window for the user to input the target image to be processed. After acquiring the target image, the device can input the image into the image processing model for computation.
In another case, the device can directly acquire remote sensing images output by a remote sensing image acquisition system. For example, the device can establish a protocol with the remote sensing image acquisition system in advance, and the system sends generated remote sensing images to the device for processing.
In some examples, the device can use an image processing model to perform image processing on the target image to extract the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building.
The image processing model is specifically a model that predicts the building roof area and the above related features from the target image. In practice, it may be a pre-trained neural network model.
Please refer to FIG. 3, which is a schematic diagram of a flow of performing image processing on a target image shown in the present disclosure. As shown in FIG. 3, the image processing model may include three branches sharing the same backbone network: branch one predicts the roof area, branch two predicts the side edges (including the side top edge and the side bottom edge), and branch three predicts the offset angle between the roof and the base of the building. The structure shown in FIG. 3 is only illustrative; in practice, the model structure can be built according to the actual situation.
The image processing model may be trained on a plurality of training samples marked with labeling information, which may include the roof area, the building side top edge and side bottom edge, and the offset angle between the roof and the base of the building.
The backbone network is used for feature prediction on the target image; it may be a feature prediction network such as VGG or ResNet, which is not particularly limited here. After the backbone network predicts the target feature map corresponding to the target image, the target feature map can be input into the three branches for further prediction.
Branch one and branch two may be pixel-level segmentation networks. Branch one can classify each pixel in the target image as belonging to the roof or the background. For example, if branch one predicts that pixel A belongs to the roof area, pixel A is classified as roof.
Branch two can classify each pixel in the target image as belonging to the side top edge, the side bottom edge, or the background. For example, if branch two predicts that pixel B belongs to the side top edge, pixel B is classified accordingly. In a feasible implementation, branch one can classify each pixel as roof, side, or background, and branch two can classify each pixel as background, the edge between roof and background, the side top edge, the side bottom edge, or the slanted edges on the left and right of the building side.
Branch three may be an image-level network that predicts the offset angle between the base and the roof of the building in the target image. The offset angle refers to the angular offset between the base and the roof; through this angle, the relationship between the change of the roof position relative to the base position along the y-axis and along the x-axis can be determined. In the embodiments of the present disclosure, the x-axis and y-axis may belong to a rectangular coordinate system whose origin is the lower-left corner of the target image.
Illustratively, the offset angle may be the angle between the slanted side edge and the vertical direction (for example, the angle between the slanted side edge and the vertically downward direction). The tangent of this angle minus 90 degrees is the ratio of the change of the roof position relative to the base position along the y-axis to the change along the x-axis.
Please refer to FIG. 4, which is a schematic diagram of an offset angle shown in the present disclosure.
The coordinate system in FIG. 4 is a rectangular coordinate system with the lower-left corner of the target image as the origin. The slanted side edge is an edge connecting the base and the roof, and the angle α is the offset angle.
In the embodiments of the present disclosure, on the one hand, since the target image is usually a remote sensing image taken by a satellite or aircraft far from the buildings, the offset angles of the buildings in the captured target image are roughly the same. On the other hand, the offset angle may also be another angle defined by developers according to the actual situation that can indicate the angle between the base and the roof of the building, for example, the angle between the slanted side edge and the vertically upward or horizontally rightward direction.
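To make the tangent relation above concrete, here is a small Python sketch for illustration only; the angle convention (measured from the vertically downward direction) follows the description, while the function name is an assumption.

```python
import math

def per_step_dy(alpha_deg, dx):
    """For an offset angle alpha measured from the vertical-down direction,
    a horizontal move of dx implies a vertical move of tan(alpha - 90) * dx."""
    return math.tan(math.radians(alpha_deg - 90.0)) * dx

print(per_step_dy(120.0, 1.0))  # y-change per unit x-change when alpha = 120 degrees
```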
It can be understood that the side bottom edge is usually one of the edges included in the building base contour, and the side top edge is usually one of the edges included in the building roof contour.
Continuing with FIG. 2, after the offset angle, the building side top edge, and the building side bottom edge are determined, S106 can be executed: determining, according to the offset angle, the offset between the side bottom edge and the side top edge.
The offset refers to the positional offset between the base and the roof; through the offset, the roof contour can be transformed into the base contour.
Illustratively, the offset may be an offset vector, i.e., the amount of movement between the roof and the base along the x-axis and the y-axis.
Please refer to FIG. 5, which is a schematic diagram of a roof-to-base offset shown in the present disclosure.
The coordinate system in FIG. 5 is a rectangular coordinate system with the lower-left corner of the target image as the origin. Point P is a point on the building base, and point Q is the point on the roof corresponding to point P. The offset between the coordinates of Q and the coordinates of P, (x2-x1, y2-y1), is the offset.
In this step, when determining the offset between the side bottom edge and the side top edge according to the offset angle, the device can input the offset angle, the side bottom edge, and the side top edge into an offset determination unit for computation to obtain the offset.
The offset determination unit is configured with an offset determination algorithm, which can determine the position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge and take that position change as the offset.
In the embodiments of the present disclosure, the algorithm may also determine the position change of the side top edge as it moves in the direction of the offset angle to the side bottom edge and take that position change as the offset. It can be understood that the offset can be determined by moving either the side top edge or the side bottom edge; the two methods follow the same principle and their steps can be cross-referenced. The following description takes determining the offset by moving the side bottom edge as an example.
Please refer to FIG. 6, which is a flowchart of an offset determination method shown in the present disclosure.
As shown in FIG. 6, the device can first execute S602: cropping, based on the preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result.
The preset frame is a preset frame containing the side top edge; the target image can be cropped by this frame to obtain the pixels inside it. In some embodiments, after the side top edge is obtained through S104, the circumscribed frame corresponding to the side top edge can be determined as the preset frame.
In some embodiments, due to model limitations, the top edges obtained through S104 that belong to the same building may be discontinuous, so a preset frame determined from a single side top edge may not crop out the top edge well, affecting the accuracy of offset determination. Therefore, to obtain an accurate preset frame, it can be determined jointly from the roof contour and the side top edges.
Specifically, when determining the preset frame, the circumscribed frame corresponding to the combined edge obtained by combining the plurality of side top edges included in the roof contour can be determined as the preset frame.
Since the preset frame is determined based on the combined edge obtained by combining the plurality of side top edges included in the roof contour, it can contain the complete top edge, so the first cropping result can contain the complete side top edge, improving the accuracy of offset determination.
After the preset frame is determined, the side top edge probability map corresponding to the side top edge can be cropped by the preset frame to obtain the first cropping result.
The side top edge probability map is the top-edge segmentation map obtained by performing image processing on the target image through S104; it contains the side top edges included in the target image. Cropping this probability map according to the preset frame yields the first cropping result, i.e., the region containing the building side top edge.
After the first cropping result is obtained, S604 can be executed: moving the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement, cropping the side bottom edge probability map corresponding to the side bottom edge based on the preset frame to obtain a plurality of second cropping results.
Before this step is executed, the preset step size and the preset maximum offset are specified.
The preset step size refers to the coordinate value by which the side bottom edge moves along the x-axis; it can be set according to the actual situation. For example, a larger preset step size can be set for a target image of higher resolution, and a smaller one otherwise. In the embodiments of the present disclosure, the preset step size may also be the coordinate value by which the side bottom edge moves along the y-axis, which is not particularly limited here. The following takes the preset step size being a movement of m along the x-axis as an example.
The preset maximum offset refers to the maximum value by which the side bottom edge moves along the x-axis; it can also be set according to the actual situation, larger for higher-resolution target images and smaller otherwise. In the embodiments of the present disclosure, the preset maximum offset may also be the maximum movement along the y-axis, which is not particularly limited here. The following takes the preset maximum offset being a maximum movement of n along the x-axis (where n is greater than m) as an example.
After the preset step size and the maximum offset are determined, S604 can be executed as described above to obtain the plurality of second cropping results.
The side bottom edge probability map is the bottom-edge segmentation map obtained by performing image processing on the target image through S104; it contains the side bottom edges included in the target image. Cropping this probability map according to the preset frame yields a second cropping result, i.e., the region of the side bottom edge probability map corresponding to the position of the building side bottom edge.
Please refer to FIG. 7, which is a schematic diagram of a side bottom edge moving process shown in the present disclosure.
The coordinate system in FIG. 7 is a rectangular coordinate system with the lower-left corner of the side bottom edge probability map as the origin. DE is the initial position of the side bottom edge; FG is an intermediate position during the movement; HI is the final position at the end of the movement (i.e., the position corresponding to the preset maximum offset); and the dashed box is the determined preset frame.
Starting from the initial position DE, the side bottom edge moves m steps along the x-axis and tan(α-90)*m steps along the y-axis each time, until it reaches position HI. After each movement, the region of the side bottom edge probability map corresponding to the coordinates of the preset frame can be cropped to obtain a second cropping result. Thus, after multiple movements, a plurality of second cropping results can be obtained.
After the plurality of second cropping results are obtained, S606 can be executed: among the plurality of second cropping results, determining the target cropping result that matches the first cropping result, and determining the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
In this step, a distance measure such as the Euclidean distance or the Mahalanobis distance can be used to determine the similarity between the first cropping result and each second cropping result, and the highest similarity is selected. The second cropping result with the highest similarity is determined as the target cropping result, and the position change of the side bottom edge at the time the target cropping result was obtained is determined as the offset.
For example, suppose the second cropping result obtained when the side bottom edge moves to position FG is the target cropping result. Then the combination of the changes along the x-axis and y-axis when the side bottom edge moves from position DE to position FG is determined as the offset.
In the above scheme, only when the side bottom edge moves to the position of the side top edge will the second cropping result cropped by the preset frame be most similar to the first cropping result. Therefore, by determining the position change of the side bottom edge when it is moved in the direction of the offset angle to the side top edge as the offset, an accurate offset can be determined.
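A minimal Python sketch of this search, assuming the two edge probability maps are 2-D numpy arrays and that matching is done by (negative) Euclidean distance between crops; the step/limit values and all names are illustrative assumptions, not from the disclosure.

```python
import math
import numpy as np

def find_offset(top_prob, bottom_prob, box, alpha_deg, step=2, max_shift=60):
    """box = (x0, y0, x1, y1): the preset frame around the side top edge."""
    x0, y0, x1, y1 = box
    reference = top_prob[y0:y1, x0:x1]                # first cropping result
    slope = math.tan(math.radians(alpha_deg - 90.0))  # dy per unit dx
    best_dist, best_shift = float("inf"), (0, 0)
    for dx in range(0, max_shift + 1, step):          # move the bottom edge stepwise
        dy = int(round(slope * dx))
        ys, xs = y0 - dy, x0 - dx                     # frame as seen from the shifted edge
        if ys < 0 or xs < 0 or ys + (y1 - y0) > bottom_prob.shape[0] \
                or xs + (x1 - x0) > bottom_prob.shape[1]:
            continue                                  # shifted past the map border
        crop = bottom_prob[ys:ys + (y1 - y0), xs:xs + (x1 - x0)]
        dist = float(np.linalg.norm(reference - crop))  # Euclidean matching score
        if dist < best_dist:
            best_dist, best_shift = dist, (dx, dy)
    return best_shift                                 # offset between bottom and top edge
```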
Continuing with FIG. 2, after the offset is determined, S108 can be executed: transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
In this step, the roof contour corresponding to the roof area can first be determined, for example by taking the contour enclosed by the outermost pixels of the roof area as the roof contour.
After the roof contour is determined, it can be transformed according to the offset to obtain the base contour.
Specifically, the coordinates of each pixel included in the roof contour can be translated along the x-axis and y-axis by the offset to obtain the coordinates of each pixel included in the base contour, thereby determining the base contour and completing the base prediction.
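The translation step itself is a plain coordinate shift; a sketch follows (numpy, names assumed). Note that since the offset was defined as roof minus base, the base contour is obtained by subtracting it.

```python
import numpy as np

def roof_to_base(roof_contour_xy, offset_xy):
    """Translate every (x, y) vertex of the roof contour by the roof-to-base offset.
    offset = roof position - base position, so the base is roof minus offset."""
    return np.asarray(roof_contour_xy) - np.asarray(offset_xy)

base = roof_to_base([[10, 40], [30, 40], [30, 20], [10, 20]], offset_xy=(4, 3))
```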
In the above scheme, the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building are determined from the acquired target image; the offset between the side bottom edge and the side top edge is then determined according to the offset angle; finally, the roof contour corresponding to the roof area is transformed according to the offset to obtain the base contour. In this way, the roof contour, whose visual features are more obvious, is translated to obtain the building base contour, completing the building base prediction.
In some embodiments, to improve the prediction accuracy of the building base, the roof contour can be regularized after it is obtained.
Please refer to FIG. 8, which is a schematic diagram of a base contour prediction flow shown in the present disclosure.
As shown in FIG. 8, when S104 is executed on the target image, the edge direction corresponding to each pixel included in the roof contour of the building can also be obtained.
The edge direction refers to the direction of the normal vector of an edge. In practice, the edge direction is usually quantified by an edge direction angle, which may be the angle between the normal vector and the vertical direction (for example, the angle between the normal vector and the vertically downward direction).
Please refer to FIG. 9, which is a schematic diagram of an edge direction shown in the present disclosure. As shown in FIG. 9, the direction of the normal vector LR of edge JK is the edge direction of edge JK. In some examples, a vertically downward direction vector LS can be constructed, and the angle β between the normal vector LR and the direction vector LS indicates the angle of the edge direction. It can be understood that the edge direction of the pixels included in an edge is usually consistent with the edge direction of that edge.
In some examples, the device can divide a preset angle into N angle intervals in advance, where N is a positive integer. In the embodiments of the present disclosure, the preset angle may be an empirical angle, for example, 360 degrees or 720 degrees. In some embodiments, to reduce the number of edge direction types and widen the span of each angle interval, N may be a positive integer less than or equal to a second preset threshold, where the second preset threshold may be an empirically suitable number of angle intervals.
Since N is less than or equal to the second preset threshold, the span of each angle interval is widened and the value range of edge direction types is narrowed. When quantifying edge directions, there is thus no need to rely excessively on the prediction accuracy of the image processing model; the accuracy of edge direction prediction can be improved, so that a more accurate roof contour can be extracted and the prediction accuracy of the building base improved.
After N is determined, identification values can be assigned to the N angle intervals, with a one-to-one correspondence between identification values and intervals. In some examples, the sequence number of an interval can be used as its identification value; for example, if an angle interval is the 3rd interval, 3 can be used as its identification value.
When determining the edge direction corresponding to each edge of the building in the target image, the target image can be input into a pre-built edge direction prediction model.
Please refer to FIG. 10, which is a schematic diagram of a flow of performing image processing on a target image shown in the present disclosure.
As shown in FIG. 10, in addition to the three branches described above, the image processing model includes branch four, which predicts the edge direction corresponding to each pixel included in the roof contour. Branch four shares the backbone network with the other three branches.
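For orientation only, a compact PyTorch sketch of a four-head model sharing one backbone, in the spirit of FIG. 10; the channel counts, backbone choice, and head designs are assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class FourHeadNet(nn.Module):
    def __init__(self, n_dir_bins=36, n_angle_classes=90):
        super().__init__()
        self.backbone = nn.Sequential(              # stand-in for a VGG/ResNet feature extractor
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.roof_head = nn.Conv2d(64, 2, 1)        # branch 1: roof vs background
        self.edge_head = nn.Conv2d(64, 3, 1)        # branch 2: top edge / bottom edge / background
        self.dir_head = nn.Conv2d(64, n_dir_bins, 1)  # branch 4: per-pixel direction bin
        self.angle_head = nn.Sequential(            # branch 3: image-level offset angle
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_angle_classes)
        )

    def forward(self, x):
        f = self.backbone(x)
        return {"roof": self.roof_head(f), "edges": self.edge_head(f),
                "edge_dir": self.dir_head(f), "angle": self.angle_head(f)}
```

The output keys match the `joint_loss` sketch given earlier, so the two illustrations compose.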
The image processing model may be trained on a plurality of training samples marked with labeling information, where the labeling information also includes the edge direction corresponding to each pixel. When constructing the training samples, each pixel of the original image can be labeled with the identification value of the angle interval to which its edge direction belongs. For example, after the original image is acquired, each pixel can be traversed and its identification value labeled with annotation software.
Continuing with FIG. 8, after the edge directions are obtained, the roof contour can be regularized based on the edge directions to obtain the roof polygon corresponding to the building.
In practice, when regularizing the roof contour based on the edge directions, each pixel among the pixels included in the roof contour can in turn be taken as a target pixel, and the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to a pixel adjacent to the target pixel determined.
When the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to the adjacent pixel reaches a first preset threshold, the target pixel is determined as a vertex of the roof polygon corresponding to the building.
The adjacent pixel may be either of the two pixels adjacent to the target pixel. The first preset threshold may be an empirically set threshold; a direction difference between two adjacent pixels that reaches the threshold indicates that the two pixels do not belong to the same edge. Through the above steps, among the pixels included in the roof contour, the pixels that belong to the same edge as their neighbors can be discarded and the corner pixels retained, achieving the purpose of regularizing the roof contour.
It can be understood that the value of the first preset threshold differs depending on how the edge direction is quantified. For example, if the edge direction is quantified by the edge direction angle, the first preset threshold may be 30; if it is quantified by the identification value of the angle interval to which the edge direction angle belongs, the first preset threshold may be 3.
In some examples, when the edge direction is quantified by the edge direction angle, the direction difference can be determined by determining the first edge direction angle corresponding to the target pixel and the second edge direction angle corresponding to the adjacent pixel.
After the first and second edge direction angles are determined, the difference between them can be determined as the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to the adjacent pixel.
In some examples, the direction difference can instead be determined by determining the first angle interval to which the edge direction corresponding to the target pixel belongs and the second angle interval to which the edge direction corresponding to the adjacent pixel belongs.
After the first and second angle intervals are determined, the difference between the identification value of the first angle interval and that of the second angle interval can be determined as the direction difference.
After the above steps have been executed for all target pixels included in the roof contour, the roof polygon corresponding to the building can be obtained based on the determined vertices.
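A sketch of this vertex test over a closed contour, assuming each contour pixel already carries the identification value of its direction bin (per the interval scheme above); the threshold and names are illustrative.

```python
def polygon_vertices(contour_xy, dir_bins, threshold=3):
    """Keep a contour pixel as a polygon vertex when its direction bin differs
    from its neighbor's bin by at least `threshold` (naive difference; a
    circular bin distance could be used to handle wrap-around)."""
    n = len(contour_xy)
    vertices = []
    for i in range(n):
        prev_bin = dir_bins[(i - 1) % n]  # adjacent pixel on the closed contour
        if abs(dir_bins[i] - prev_bin) >= threshold:
            vertices.append(contour_xy[i])
    return vertices
```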
In some embodiments, after the roof polygon corresponding to the building is obtained from the determined vertices, the vertices of the roof polygon can be corrected based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
It can be understood that the vertex correction model may be a model independent of the image processing model used in the present disclosure, or a sub-model (sub-module) of the image processing model, which is not particularly limited here. To control the amount of computation, the vertex correction model can be made a sub-model (sub-module) of the image processing model and used for vertex correction.
Since this scheme uses a vertex correction model built on a graph neural network to further correct the roof polygon, a more accurate roof contour can be obtained, improving the prediction accuracy of the building base.
Continuing with FIG. 8, after the roof polygon is obtained, it can be transformed according to the offset to obtain the base contour.
Since the roof polygon is the regular shape obtained by regularizing the roof contour, this scheme can predict a more accurate roof contour, thereby improving the prediction accuracy of the building base.
Predicting building height normally depends on data such as multi-scene, multi-view remote sensing images, lidar (LIDAR) data, and digital surface models (DSM). Such data are costly and difficult to acquire, so building height prediction is difficult and expensive.
In some embodiments, to predict building height from a single-scene remote sensing image, after the offset is determined, the building height can also be predicted from the single-scene target image. Specifically, after the offset is determined, the corresponding building height can be determined as the height of the building based on the offset and a predetermined scale between building height and offset.
In practice, the real heights of some buildings can be acquired in advance, together with the offsets of those buildings determined by the offset determination method described in the present disclosure; the scale between building height and offset is then determined from these data.
Once the scale is determined, the corresponding building height can be determined from the predicted offset.
In the above scheme, after the offset is determined, the corresponding building height can be obtained based on the offset and the predetermined scale between building height and offset. Therefore, when predicting building height, there is no need to rely on multi-scene, multi-view remote sensing images, lidar (LIDAR) data, digital surface model (DSM) data, and the like, which reduces the cost and difficulty of building height prediction.
The above introduces the building base prediction and building height prediction schemes shown in the present disclosure; the training method of the image processing model is introduced below.
In the present disclosure, the image processing model used in the building base prediction scheme may include a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
To improve the model's prediction accuracy for the base area and its generalization ability, a multi-task joint training method is adopted when training the image processing model.
In some embodiments, to enrich the supervision used in training and thereby improve prediction accuracy, constraints can be introduced on the building roof area and side area, the edges included in the building outline, the edge direction corresponding to each pixel included in the building, the offset angle between the roof and the base, and so on.
Please refer to FIG. 11, which is a flowchart of an image processing model training method shown in the present disclosure.
As shown in FIG. 11, the method includes:
S1102: acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building.
In this step, the original images can be annotated manually or with machine assistance. For example, after an original image is acquired, image annotation software can be used to mark, for each pixel, whether it belongs to the building roof, the side area, or the background; which edge of the building outline it belongs to; and its corresponding edge direction. In addition, the offset angle between the roof and the base of the building in the image can be labeled. The training samples are obtained after these labeling operations are completed on the original images. In the embodiments of the present disclosure, one-hot encoding or other methods may be used when encoding the training samples from the labeling information; the present disclosure does not limit the specific encoding method.
S1104: constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model.
In this step, the loss information corresponding to each sub-model can be determined first. To improve the prediction accuracy of the sub-models, in the present disclosure, the loss information corresponding to each sub-model is cross-entropy loss information.
After the loss information of each sub-model is determined, the joint learning loss information can be constructed from it; for example, the losses of the sub-models can be summed to obtain the joint learning loss information.
In the embodiments of the present disclosure, a regularization term may also be added to the joint learning loss information, which is not particularly limited here.
After the joint learning loss information and the training samples are determined, S1106 can be executed: based on the joint learning loss information, jointly training each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
When training the model, hyperparameters such as the learning rate and the number of training epochs can be specified first. The image processing model can then be trained in a supervised manner on the training samples marked with labeling information.
During supervised training, after the image processing model produces a result by forward propagation, the error between the labeling information and the result can be evaluated based on the constructed joint learning loss information. After the error is obtained, the descent gradient can be determined by stochastic gradient descent, and the model parameters of the image processing model updated by backpropagation. The process is repeated until each sub-model converges. The embodiments of the present disclosure do not particularly limit the conditions for model convergence.
Because a supervised joint training method is used, the four sub-models included in the image processing model can be trained simultaneously, so that the sub-models both constrain and promote each other during training. On the one hand this improves the convergence efficiency of the image processing model; on the other hand it encourages the backbone network shared by the sub-models to predict features that are more beneficial to base-area prediction, improving the accuracy of base prediction.
Corresponding to any of the above embodiments, the present disclosure further provides an image processing apparatus.
Please refer to FIG. 12, which is a schematic diagram of an image processing apparatus shown in the present disclosure.
As shown in FIG. 12, the apparatus 1200 includes: an acquisition module 1210 for acquiring a target image containing a building; an image processing module 1220 for performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building; an offset determination module 1230 for determining, according to the offset angle, the offset between the side bottom edge and the side top edge; and a transformation module 1240 for transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour.
In some examples shown, the apparatus 1200 further includes: a building height determination module for determining the height of the building based on the offset and a predetermined scale between building height and offset.
In some examples shown, the apparatus 1200 further includes: an edge direction determination module for performing image processing on the target image to determine the edge direction corresponding to each pixel included in the roof contour of the building; and a regularization processing module for regularizing the roof contour based on the edge directions to obtain the roof polygon corresponding to the building. The transformation module 1240 is specifically configured to transform the roof polygon according to the offset to obtain the base contour.
In some examples shown, the regularization processing module includes: a first determination sub-module for taking any pixel among the pixels included in the roof contour as a target pixel and determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to a pixel adjacent to the target pixel; a second determination sub-module for determining the target pixel as a vertex of the roof polygon corresponding to the building when the direction difference reaches a first preset threshold; and a roof polygon determination sub-module for obtaining the roof polygon corresponding to the building based on the determined vertices.
In some examples shown, the apparatus 1200 further includes a dividing module for dividing a preset angle into N angle intervals and assigning identification values to the N angle intervals, where N is a positive integer. The first determination sub-module is specifically configured to: determine the first angle interval to which the edge direction corresponding to the target pixel belongs; determine the second angle interval to which the edge direction corresponding to the adjacent pixel belongs; and determine the difference between the identification values of the first and second angle intervals as the direction difference.
In some examples shown, N is a positive integer less than or equal to a second preset threshold.
In some examples shown, the apparatus 1200 further includes: a vertex correction module for correcting the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, where the vertex correction model is a model determined based on a graph neural network.
In some examples shown, the offset determination module 1230 includes: an offset determination sub-module for determining the position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge, and taking that position change as the offset.
In some examples shown, the offset determination sub-module is specifically configured to: crop, based on a preset frame corresponding to the side top edge, the side top edge probability map corresponding to the side top edge to obtain a first cropping result, where the side top edge probability map covers the image region of the target image containing the side top edge; move the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement crop, based on the preset frame, the side bottom edge probability map corresponding to the side bottom edge to obtain a plurality of second cropping results, where the side bottom edge probability map covers the image region of the target image containing the side bottom edge; and, among the plurality of second cropping results, determine the target cropping result that matches the first cropping result and determine the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
In some examples shown, the apparatus 1200 further determines the preset frame by: determining the circumscribed frame corresponding to the side top edge as the preset frame; or determining, as the preset frame, the circumscribed frame corresponding to the combined edge obtained by combining a plurality of side top edges included in the roof contour.
In some examples shown, the image processing module 1220 is specifically configured to: perform image processing on the target image with an image processing model to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building, where the image processing model includes a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
In some examples shown, the training apparatus 1300 corresponding to the training method of the image processing model includes: a training sample acquisition module 1310 for acquiring a plurality of training samples that involve buildings and include labeling information, where the labeling information includes the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building; a loss information determination module 1320 for constructing joint learning loss information based on the loss information corresponding to each sub-model included in the image processing model; and a joint training module 1330 for jointly training, based on the joint learning loss information, each sub-model included in the image processing model with the plurality of training samples until each sub-model converges.
The embodiments of the image processing apparatus shown in the present disclosure can be applied to an electronic device. Accordingly, the present disclosure discloses an electronic device, which may include: a processor; and a memory for storing processor-executable instructions, where the processor is configured to invoke the executable instructions stored in the memory to implement the image processing method shown in any of the above embodiments.
Please refer to FIG. 13, which is a hardware structure diagram of an electronic device shown in the present disclosure.
As shown in FIG. 13, the electronic device may include a processor for executing instructions, a network interface for making network connections, a memory for storing operating data for the processor, and a non-volatile memory for storing instructions corresponding to the image processing apparatus.
The embodiments of the image processing apparatus may be implemented by software, or by hardware, or by a combination of software and hardware. Taking a software implementation as an example, a device in the logical sense is formed by the processor of the electronic device where the apparatus is located reading the corresponding computer program instructions from the non-volatile memory into memory for execution. In terms of hardware, besides the processor, memory, network interface, and non-volatile memory shown in FIG. 13, the electronic device where the apparatus is located may, depending on its actual functions, also include other hardware, which is not described further here.
It can be understood that, to increase processing speed, the instructions corresponding to the image processing apparatus may also be stored directly in the memory, which is not limited here.
The present disclosure provides a computer-readable storage medium storing a computer program, where the computer program is used to execute the image processing method shown in any of the above embodiments.
Those skilled in the art should understand that one or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
"And/or" in the present disclosure means having at least one of the two; for example, "A and/or B" may include three cases: A, B, and "A and B".
The embodiments in the present disclosure are described in a progressive manner; for the same or similar parts between the embodiments, reference may be made to each other, and each embodiment focuses on what differs from the other embodiments. In particular, since the data processing device embodiments are basically similar to the method embodiments, their description is relatively simple, and reference may be made to the relevant parts of the method embodiments.
Specific embodiments of the present disclosure have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
Embodiments of the subject matter and functional operations described in the present disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that may include the structures disclosed in the present disclosure and their structural equivalents, or in a combination of one or more of these. Embodiments of the subject matter described in the present disclosure can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions can be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode and transmit information to a suitable receiver apparatus for execution by a data processing apparatus. The computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
The processes and logic flows described in the present disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by special purpose logic circuitry, such as an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit), and the apparatus can also be implemented as special purpose logic circuitry.
Computers suitable for executing a computer program include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or will be operatively coupled to such devices to receive data from them, transfer data to them, or both. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
Although the present disclosure contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what is claimed, but mainly as describing the features of specific embodiments of a particular disclosure. Certain features described in multiple embodiments of the present disclosure can also be combined and implemented in a single embodiment. Conversely, various features described in a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination. Moreover, although features may act in certain combinations as described above and even be initially claimed as such, one or more features from a claimed combination can in some cases be removed from that combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that these operations be performed in the particular order shown or sequentially, or that all of the illustrated operations be performed, to achieve desired results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the above embodiments should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve desired results. In some implementations, multitasking and parallel processing may be advantageous.
The above are only preferred embodiments of one or more embodiments of the present disclosure and are not intended to limit one or more embodiments of the present disclosure. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of the present disclosure shall be included in the protection scope of one or more embodiments of the present disclosure.

Claims (15)

  1. An image processing method, characterized in that the method comprises:
    acquiring a target image containing a building;
    performing image processing on the target image to determine a roof area of the building, a side bottom edge and a side top edge of the building, and an offset angle between a roof and a base of the building;
    determining, according to the offset angle, an offset between the side bottom edge and the side top edge;
    transforming, according to the offset, a roof contour corresponding to the roof area to obtain a base contour.
  2. The method according to claim 1, characterized in that the method further comprises:
    performing image processing on the target image to determine an edge direction corresponding to each pixel included in the roof contour of the building;
    regularizing the roof contour based on the edge directions to obtain a roof polygon corresponding to the building;
    wherein transforming, according to the offset, the roof contour corresponding to the roof area to obtain the base contour comprises:
    transforming, according to the offset, the roof polygon to obtain the base contour.
  3. The method according to claim 2, characterized in that regularizing the roof contour based on the edge directions to obtain the roof polygon corresponding to the building comprises:
    taking any pixel among the pixels included in the roof contour as a target pixel,
    determining a direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to a pixel adjacent to the target pixel;
    when the direction difference reaches a first preset threshold, determining the target pixel as a vertex of the roof polygon corresponding to the building;
    obtaining the roof polygon corresponding to the building based on the determined vertices of the roof polygon.
  4. The method according to claim 3, characterized in that the method further comprises:
    dividing a preset angle into N angle intervals and assigning identification values to the N angle intervals, wherein N is a positive integer;
    wherein determining the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to the pixel adjacent to the target pixel comprises:
    determining a first angle interval to which the edge direction corresponding to the target pixel belongs;
    determining a second angle interval to which the edge direction corresponding to the adjacent pixel belongs;
    determining a difference between the identification value corresponding to the first angle interval and the identification value corresponding to the second angle interval as the direction difference between the edge direction corresponding to the target pixel and the edge direction corresponding to the adjacent pixel.
  5. The method according to claim 4, characterized in that N is a positive integer less than or equal to a second preset threshold.
  6. The method according to any one of claims 3 to 5, characterized in that the method further comprises:
    correcting the vertices of the roof polygon based on a vertex correction model to obtain a corrected roof polygon, wherein the vertex correction model is a model determined based on a graph neural network.
  7. The method according to any one of claims 1 to 6, characterized in that determining, according to the offset angle, the offset between the side bottom edge and the side top edge comprises:
    determining a position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge, as the offset.
  8. The method according to claim 7, characterized in that determining the position change of the side bottom edge as it moves in the direction of the offset angle to the side top edge as the offset comprises:
    cropping, based on a preset frame corresponding to the side top edge, a side top edge probability map corresponding to the side top edge to obtain a first cropping result, the side top edge probability map comprising an image region of the target image containing the side top edge;
    moving the side bottom edge multiple times in the direction of the offset angle according to a preset step size and a preset maximum offset, and after each movement, cropping, based on the preset frame, a side bottom edge probability map corresponding to the side bottom edge to obtain a plurality of second cropping results, the side bottom edge probability map comprising an image region of the target image containing the side bottom edge;
    determining, among the plurality of second cropping results, a target cropping result matching the first cropping result, and
    determining the position change of the side bottom edge at the time the target cropping result was obtained as the offset.
  9. The method according to claim 8, characterized in that the preset frame is determined by either of the following:
    determining a circumscribed frame corresponding to the side top edge as the preset frame; or,
    determining, as the preset frame, a circumscribed frame corresponding to a combined edge obtained by combining a plurality of side top edges included in the roof contour.
  10. The method according to any one of claims 2 to 9, characterized in that performing image processing on the target image to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building comprises:
    performing image processing on the target image with an image processing model to determine the roof area of the building, the side bottom edge and side top edge of the building, and the offset angle between the roof and the base of the building;
    wherein the image processing model comprises a roof area prediction sub-model for outputting the roof area, a building edge prediction sub-model for outputting the side bottom edge and the side top edge, a building edge direction prediction sub-model for outputting the edge direction, and an offset angle prediction sub-model for outputting the offset angle.
  11. The method according to claim 10, characterized in that the image processing model is obtained through the following training:
    acquiring a plurality of training samples that involve buildings and include labeling information, wherein the labeling information comprises the roof area and side area of the building, each edge included in the building outline, the edge direction corresponding to each pixel included in the building, and the offset angle between the roof and the base of the building;
    constructing joint learning loss information based on loss information corresponding to each of the sub-models included in the image processing model;
    jointly training, based on the joint learning loss information, each of the sub-models included in the image processing model with the plurality of training samples until each of the sub-models converges.
  12. The method according to any one of claims 1 to 11, characterized in that the method further comprises:
    determining a building height of the building based on the offset and a predetermined scale between building height and offset.
  13. An image processing apparatus, characterized in that the apparatus comprises:
    an acquisition module for acquiring a target image containing a building;
    an image processing module for performing image processing on the target image to determine a roof area of the building, a side bottom edge and a side top edge of the building, and an offset angle between a roof and a base of the building;
    an offset determination module for determining, according to the offset angle, an offset between the side bottom edge and the side top edge;
    a transformation module for transforming, according to the offset, a roof contour corresponding to the roof area to obtain a base contour.
  14. An electronic device, characterized in that the device comprises a processor and a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the executable instructions stored in the memory to implement the image processing method according to any one of claims 1 to 12.
  15. A computer-readable storage medium, characterized in that the storage medium stores a computer program, and the computer program is used to execute the image processing method according to any one of claims 1 to 12.
PCT/CN2021/115515 2020-09-27 2021-08-31 Image processing method and apparatus, device, and storage medium WO2022062854A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011036378.9 2020-09-27
CN202011036378.9A CN112037220A (zh) Image processing method and apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022062854A1 true WO2022062854A1 (zh) 2022-03-31

Family

ID=73574957

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/115515 WO2022062854A1 (zh) 2020-09-27 2021-08-31 Image processing method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN112037220A (zh)
WO (1) WO2022062854A1 (zh)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037220A (zh) * 2020-09-27 2020-12-04 上海商汤智能科技有限公司 一种图像处理方法、装置、设备和存储介质


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1526108A (zh) * 2001-02-14 2004-09-01 无线谷通讯有限公司 对地形、建筑、和基础设施进行建模和管理的方法和系统
CN104240247A (zh) * 2014-09-10 2014-12-24 无锡儒安科技有限公司 一种基于单张图片的建筑物俯视轮廓的快速提取方法
CN112037220A (zh) * 2020-09-27 2020-12-04 上海商汤智能科技有限公司 一种图像处理方法、装置、设备和存储介质

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KEQI ZHANG, JIANHUA YAN, AND SHU-CHING CHEN: "Automatic Construction of Building Footprints From Airborne LIDAR Data", IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 1 October 2006 (2006-10-01), pages 1 - 11, XP055914558, [retrieved on 20220421] *
XU, YONGZHI: "Extracting Building Footprints from Digital Measurable Images", CHINESE MASTER'S THESES FULL-TEXT DATABASE, ENGINEERING SCIENCE II, 15 February 2014 (2014-02-15), XP055914553, [retrieved on 20220421] *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113256790A (zh) * 2021-05-21 2021-08-13 珠海金山网络游戏科技有限公司 建模方法及装置
CN113256790B (zh) * 2021-05-21 2024-06-07 珠海金山数字网络科技有限公司 建模方法及装置
CN116342591A (zh) * 2023-05-25 2023-06-27 兴润建设集团有限公司 一种建筑物参数的视觉解析方法
CN117455815A (zh) * 2023-10-18 2024-01-26 二十一世纪空间技术应用股份有限公司 基于卫星影像平顶建筑物顶底偏移校正方法及相关设备

Also Published As

Publication number Publication date
CN112037220A (zh) 2020-12-04

Similar Documents

Publication Publication Date Title
CN108717710B (zh) Positioning method, apparatus, and system in an indoor environment
WO2022062854A1 (zh) Image processing method and apparatus, device, and storage medium
WO2022062543A1 (zh) Image processing method and apparatus, device, and storage medium
US10953545B2 (en) System and method for autonomous navigation using visual sparse map
RU2713611C2 (ru) Method of modeling three-dimensional space
CN113256712B (zh) Positioning method and apparatus, electronic device, and storage medium
US11941831B2 (en) Depth estimation
JP2019087229A (ja) Information processing apparatus, control method for information processing apparatus, and program
CN110276768B (zh) Image segmentation method, image segmentation apparatus, image segmentation device, and medium
US20190301871A1 (en) Direct Sparse Visual-Inertial Odometry Using Dynamic Marginalization
CN110111364B (zh) Motion detection method and apparatus, electronic device, and storage medium
US11790661B2 (en) Image prediction system
US20220164603A1 (en) Data processing method, data processing apparatus, electronic device and storage medium
CN114641800A (zh) Method and system for forecasting crowd dynamics
CN115421158A (zh) Self-supervised-learning 3D semantic mapping method and apparatus for solid-state lidar
CN113658203A (zh) Building 3D contour extraction and neural network training method and apparatus
US20220164595A1 (en) Method, electronic device and storage medium for vehicle localization
CN117132649A (zh) Ship video positioning method and apparatus fusing artificial intelligence with BeiDou satellite navigation
Ribacki et al. Vision-based global localization using ceiling space density
CN113916223B (zh) Positioning method and apparatus, device, and storage medium
US11915449B2 (en) Method and apparatus for estimating user pose using three-dimensional virtual space model
KR20230029981A (ko) 포즈 결정을 위한 시스템 및 방법
US10447992B1 (en) Image processing method and system
Abdelaal et al. Gramap: Qos-aware indoor mapping through crowd-sensing point clouds with grammar support
Kanai et al. Improvement of 3D Monte Carlo localization using a depth camera and terrestrial laser scanner

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21871226

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022545960

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.09.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21871226

Country of ref document: EP

Kind code of ref document: A1