WO2022048468A1 - Plane contour recognition method and apparatus, computer device, and storage medium - Google Patents

Plane contour recognition method and apparatus, computer device, and storage medium

Info

Publication number
WO2022048468A1
WO2022048468A1 PCT/CN2021/114064 CN2021114064W WO2022048468A1 WO 2022048468 A1 WO2022048468 A1 WO 2022048468A1 CN 2021114064 W CN2021114064 W CN 2021114064W WO 2022048468 A1 WO2022048468 A1 WO 2022048468A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame image
target
plane
edge points
object plane
Prior art date
Application number
PCT/CN2021/114064
Other languages
English (en)
French (fr)
Inventor
张晟浩
凌永根
迟万超
郑宇
姜鑫洋
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP21863535.7A priority Critical patent/EP4131162A4/en
Publication of WO2022048468A1 publication Critical patent/WO2022048468A1/zh
Priority to US17/956,364 priority patent/US20230015214A1/en

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/12 Edge-based segmentation
    • G06T 7/13 Edge detection
    • G06T 7/149 Segmentation; edge detection involving deformable models, e.g. active contour models
    • G06T 7/162 Segmentation; edge detection involving graph-based methods
    • G06T 7/181 Segmentation; edge detection involving edge growing; involving edge linking
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G06T 2207/10016 Video; image sequence
    • G06T 2207/10028 Range image; depth image; 3D point clouds
    • G06T 2207/30244 Camera pose

Definitions

  • the present application relates to the field of computer technology, and in particular, to a method, apparatus, computer device and storage medium for identifying a plane outline.
  • the method of deep learning is usually used to identify the plane contour of the target, so as to use the identified plane contour to determine the specific spatial position and size of the target.
  • in traditional schemes, using deep learning to identify the plane contour of a target object makes the recognition accuracy depend heavily on pre-training, and the training takes a long time, which reduces the efficiency of plane contour recognition of the target object.
  • a method, apparatus, computer device, and storage medium for identifying a plane contour are provided.
  • a fitting graph obtained by fitting the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image is superimposed on the object plane for display;
  • the previous frame image is a frame image collected from the target environment before the target frame image;
  • a plane contour identification device is provided, the device comprising:
  • the first display module is used to display the target frame image obtained by collecting the target environment
  • a superimposing module for superimposing a fitting graph formed by fitting the edge points of each object plane in the target frame image and the edge points of the corresponding object planes in the previous frame image on the object plane for display;
  • the previous frame image is a frame image collected from the target environment before the target frame image;
  • a deletion module, configured to delete, from the fitting graph, edge points that do not appear in the object plane of the previous frame image;
  • the second display module is configured to display, on the object plane of the target frame image, the plane outline formed by the remaining edge points in the fitting graph.
  • a computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • a fitting graph obtained by fitting the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image is superimposed on the object plane for display;
  • the previous frame image is a frame image collected from the target environment before the target frame image;
  • a fitting graph obtained by fitting the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image is superimposed on the object plane for display;
  • the edge points that do not appear in the object plane of the previous frame image are deleted;
  • the previous frame image is a frame image acquired from the target environment before the target frame image;
  • a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium, and when the processor executes the computer instructions, the computer device is caused to execute the steps of the above-mentioned plane contour recognition method.
  • the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image are fitted to obtain a fitted graph; the previous frame image is a frame image collected from the target environment before the target frame image;
  • a plane contour identification device is provided, the device comprising:
  • an acquisition module used for acquiring the target frame image obtained by collecting the target environment
  • the fitting module is used for fitting the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image to obtain a fitting figure;
  • the previous frame image is a frame image collected from the target environment before the target frame image;
  • a deletion module, configured to delete, from the fitting graph, edge points that do not appear in the object plane of the previous frame image;
  • a building module is used to identify the contour formed by the remaining edge points in the fitting graph as a plane contour.
  • a computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image are fitted to obtain a fitted graph; the previous frame image is a frame image collected from the target environment before the target frame image;
  • the edge points of each object plane in the target frame image and the edge points of the corresponding object plane in the previous frame image are fitted to obtain a fitted graph; the previous frame image is a frame image collected from the target environment before the target frame image;
  • a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium, and when the processor executes the computer instructions, the computer device is caused to execute the steps of the above-mentioned plane contour recognition method.
  • Fig. 1 is the application environment diagram of the plane outline identification method in one embodiment
  • FIG. 2 is a schematic flowchart of a method for identifying a plane contour in one embodiment
  • FIG. 3 is a schematic diagram of displaying a fitted figure and a plane outline on a target frame image in one embodiment
  • FIG. 4 is a schematic diagram of a plane outline fitted to two staggered cubic objects in one embodiment
  • FIG. 5 is a schematic diagram of optimizing the fitting graph of two staggered cubic objects in one embodiment, and fitting a plane outline according to the optimized fitting graph;
  • FIG. 6 is a schematic flowchart of a method for identifying a plane contour in another embodiment
  • FIG. 7 is a schematic diagram of constructing a spatial coordinate system in one embodiment
  • FIG. 8 is a schematic flowchart of a method for identifying a plane contour in another embodiment
  • FIG. 9 is a schematic diagram of the comparison between a rectangular outline fitted without using Apriltag and a rectangular outline fitted by Apriltag in one embodiment
  • Fig. 10 is a structural block diagram of a plane contour identification device in one embodiment
  • FIG. 11 is a structural block diagram of a plane contour identification device in another embodiment
  • FIG. 12 is a structural block diagram of a plane contour identification device in another embodiment
  • FIG. 13 is a structural block diagram of a plane contour identification device in another embodiment
  • Figure 14 is an internal structure diagram of a computer device in one embodiment
  • FIG. 15 is an internal structure diagram of a computer apparatus in another embodiment.
  • the plane contour recognition method provided by the present application can be applied to the application environment as shown in FIG. 1 .
  • the terminal 102 and the server 104 are included.
  • the terminal 102 collects the target environment to obtain the target frame image, and displays the collected target frame image;
  • the terminal 102 fits the edge points of each object plane in the target frame image with the edge points of the corresponding object plane in the previous frame image to obtain a fitted graph, and then superimposes the fitted graph on the object plane for display; in the fitted graph, the edge points that do not appear in the object plane of the previous frame image are deleted, the previous frame image being a frame image collected from the target environment before the target frame image;
  • on the object plane of the target frame image, the plane outline formed by the remaining edge points in the fitting graph is displayed.
  • the fitting of the fitting graph may also be performed by the server: the terminal 102 may send the target frame image and the previous frame image to the server 104, the server 104 fits the edge points of each object plane in the target frame image to obtain the fitted graph, and the fitted graph is then sent to the terminal 102 to be superimposed on the object plane for display.
  • the terminal 102 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • the server 104 may be an independent physical server, a server cluster composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud databases, cloud storage, and CDN.
  • the terminal 102 and the server 104 may be connected through a communication connection method such as Bluetooth, USB (Universal Serial Bus) or a network, which is not limited in this application.
  • a plane contour recognition method is provided; the method can be executed by the terminal 102, or executed by the terminal 102 and the server 104 in cooperation. The method being executed by the terminal 102 in FIG. 1 is used as an example for illustration, and the method includes the following steps:
  • the target environment may be the working environment where the terminal is located, such as the environment in which the robot sorts items in the express delivery field, or the road environment when the robot walks in the working process.
  • the target frame image may refer to an image obtained by the terminal collecting an image or video of the target environment through a built-in camera, or an image obtained by collecting an image or video of the target environment through an independent camera.
  • the target frame image may be a three-dimensional image carrying depth information, such as a depth image.
  • the target frame image can be an image of a ladder in the target environment, as shown in FIG. 3 .
  • a camera built in the terminal collects a target environment in real time to obtain a video
  • the video is decoded to obtain a target frame image, and then each target frame image obtained by decoding is displayed.
  • an independent camera collects the target environment in real time to obtain a video
  • the video is sent to the terminal in real time, and the terminal decodes the video to obtain the target frame image, and then displays each target frame image obtained by decoding.
  • the target frame image is a video frame in the video.
  • the captured target frame image is displayed on the display screen.
  • the captured target frame image is sent to the terminal, and the terminal displays the received target frame image.
  • the object plane may refer to the surface of the object displayed in the target frame image, that is, the surface of the object that is not occluded; the object plane may also appear in the previous frame image in addition to appearing in the target frame image.
  • the object plane can be a horizontal plane or a curved surface. In the following embodiments, a horizontal object plane is used as an example for description.
  • the object may refer to a target object in the target environment and captured in the target frame image, and the surface of the object in all directions may be a horizontal plane; in addition, the object may have a specific geometric shape.
  • the object may be a courier box for express delivery, or a ladder or the like.
  • the fitted graph may refer to a closed curve graph obtained by fitting edge points with a curve, or a closed graph obtained by fitting edge points with a straight line.
  • the dotted line graph is a fitting graph formed by fitting the edge points of a certain step plane of the stairs.
  • the corresponding edge point may also be a voxel of the object plane in the target frame image, or a three-dimensional edge point of the target object in the target environment (ie, the real scene).
  • the corresponding edge points may also be two-dimensional pixels, or two-dimensional edge points of the target object in the target environment.
  • the degree to which the edge points of the object plane are fitted can be judged from the superimposed fitting graph.
  • if the fitting graph meets the preset fitting condition, the fitting graph can be directly used as the plane outline of the object plane.
  • S206 may be executed.
  • the preset fitting condition may refer to whether the shape of the fitting graph on the object plane is the target shape, for example, for a rectangular object, whether the fitting graph is a rectangle; or, in a side view, whether the fitting graph is a rhombus.
  • the above-mentioned fitting graph is obtained by fitting the edge points of each object plane in the target frame image and the edge points of the corresponding object planes in the previous frame image through an edge point fitting step.
  • the fitting step includes: the terminal determines the spatial position corresponding to each point in the target frame image according to the depth information; determines the plane where each point is located based on the spatial position and the plane equation, and obtains the object plane; and fits the edge points of the object plane with the edge points of the corresponding object plane in the previous frame image to obtain a fitted graph.
  • the terminal displays the obtained fitting graph by superimposing it on the object plane of the target frame image.
  • the spatial position may be a spatial coordinate in a three-dimensional spatial coordinate system.
  • the target frame image includes a graphic code that carries direction information.
  • the above step of determining the spatial position corresponding to each point in the target frame image according to the depth information may specifically include: the terminal determines the reference direction of the coordinate system according to the direction information carried by the graphic code; constructs a spatial coordinate system based on the reference direction of the coordinate system; and, in the spatial coordinate system, determines the spatial position corresponding to each point in the target frame image based on the depth information.
  • the graphic code can be a barcode or a two-dimensional code, such as Apriltag used to indicate the direction.
  • the terminal can calculate, according to the depth information, the distance between each point of the object in the target frame image and the camera, as well as the distances between the points of the object; according to these distances, the spatial position of each point of the object in the target frame image in the spatial coordinate system can be determined.
  • the step of determining the plane where each point on the surface of the object is located may specifically include: the terminal inputs the spatial positions (that is, the spatial coordinates) and the plane equation into a plane fitting model, and the plane fitting model fits the spatial positions of the points of the object according to the plane equation to obtain the plane where each point is located, thereby obtaining the object plane.
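The patent does not spell out the plane fitting model; the following is a minimal sketch, assuming a pinhole camera with known intrinsics (fx, fy, cx, cy), of how depth information can be back-projected into spatial positions and a plane equation fitted to them with a simple least-squares fit (RANSAC could equally be used).

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) to 3D points in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]                      # drop pixels with no depth

def fit_plane(points):
    """Least-squares plane fit: returns unit normal n and offset d with n.p + d = 0."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    n = vt[-1]                                      # direction of least variance
    d = -n.dot(centroid)
    return n, d
```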
  • the terminal uses straight lines or curves to fit the edge points of the object plane to obtain a closed curve graph or a closed graph composed of straight line segments, and the closed curve graph or closed graph is determined as the fitted graph of the edge points of the object plane.
  • the above-mentioned step of fitting the edge points of the object plane and the edge points of the corresponding object plane of the previous frame image to obtain the fitted graph may specifically include: when the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image, the terminal determines the previous target frame images containing the object plane among the previous frame images; extracts edge points from the object plane in the previous target frame images; selects target edge points from the edge points extracted from the object plane in the previous target frame images and the edge points of the object plane in the target frame image; and fits the selected target edge points to obtain a fitted graph. Then, the terminal displays the obtained fitted graph by superimposing it on the object plane of the target frame image.
  • the above-mentioned edge points are three-dimensional edge points.
  • the previous frame image is the frame image collected from the target environment before the target frame image, such as the previous frame image of the target frame image, or the first n frame images collected before the target frame image (that is, the previous frame image to the previous nth frame).
  • the previous target frame image refers to the image of the object plane containing the object in the previous frame image.
  • the selection process of the target edge points may include: first superimposing the previous target frame image and the target frame image, so that the edge points of the object plane in the target frame image and the edge points of the corresponding object plane in the previous target frame image are combined into an edge point set, and then selecting edge points from the dense regions of the edge point set; selecting edge points from the dense regions avoids interference from discrete points. For example, after the edge points are superimposed, some edge points may deviate from the normal range, and excluding these edge points can improve the fitting effect.
  • weights may also be set for each frame image according to the distance between the camera and the target object, and the target edge points are then selected according to the weights: more target edge points are selected from frame images with larger weights, and fewer target edge points are selected from frame images with smaller weights.
  • the step of judging whether the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image, may specifically include: the terminal maps the three-dimensional edge points to two-dimensional edge points; determines the convex polygon corresponding to the two-dimensional edge points and calculates the area of the convex polygon; determines the circumscribed figure of the two-dimensional edge points and calculates the area of the circumscribed figure; when the ratio of the area of the convex polygon to the area of the circumscribed figure reaches a preset ratio, it is determined that the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image (see the sketch below).
  • a convex polygon means: if any side of the polygon is extended infinitely in both directions into a straight line, all other sides lie on the same side of that line; none of the interior angles of the polygon is a reflex angle; and any line segment between two vertices lies inside the polygon or on one of its edges.
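A sketch of the area-ratio check described above, using OpenCV to compute the convex hull of the two-dimensional edge points and their minimum-area circumscribed rectangle; the 0.9 threshold mentioned later in the text and the direction of the comparison are assumptions of this sketch, not statements from the patent.

```python
import numpy as np
import cv2

def hull_to_rect_ratio(edge_points_2d):
    """Ratio of the convex-hull area of the 2D edge points to the area of their
    minimum-area circumscribed rectangle."""
    pts = edge_points_2d.astype(np.float32)
    hull_area = cv2.contourArea(cv2.convexHull(pts))
    (_, _), (w, h), _ = cv2.minAreaRect(pts)
    return hull_area / (w * h) if w * h > 0 else 0.0

# Per the later description, a ratio below roughly 0.9 suggests that the visible
# convex region covers only part of the fitted rectangle (partial view, occlusion,
# or a non-rectangular plane); the exact comparison convention is an assumption here.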
  • the above-mentioned step of selecting the target edge points from the edge points extracted from the object plane in the previous target frame image and the edge points of the object plane in the target frame image may specifically include: the terminal determines a first weight corresponding to the target frame image and a second weight corresponding to the previous target frame image, the first weight and the second weight being unequal; among the edge points of the object plane in the target frame image, first target edge points are selected according to the first weight; among the edge points extracted from the object plane in the previous target frame image, second target edge points are selected according to the second weight; and the first target edge points and the second target edge points are used as the target edge points.
  • the sizes of the first weight and the second weight are related to the distance between the camera and the object when the corresponding image was captured: the farther the camera is from the object in the target environment, the smaller the corresponding weight; likewise, the closer the camera is to the object, the larger the corresponding weight.
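A sketch of distance-weighted selection of target edge points; the patent only states that closer frames receive larger weights, so the inverse-distance weighting and the sampling scheme below are assumptions.

```python
import numpy as np

def sample_edge_points(edge_points_per_frame, camera_distances, total=400, seed=0):
    """Select target edge points from several frames, favouring frames captured
    closer to the object (larger weight), as described above."""
    rng = np.random.default_rng(seed)
    w = 1.0 / np.asarray(camera_distances, dtype=float)   # assumed inverse-distance weights
    w /= w.sum()
    picked = []
    for pts, wi in zip(edge_points_per_frame, w):
        k = min(len(pts), int(round(total * wi)))
        if k > 0:
            idx = rng.choice(len(pts), size=k, replace=False)
            picked.append(pts[idx])
    return np.vstack(picked)
```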
  • the terminal determines the size of the fitted graphic; when the size is smaller than the preset size, reacquires the target frame image obtained by collecting the target environment, and then executes S204. When the size is greater than or equal to the preset size, perform S206.
  • if the length and width of the object plane corresponding to the object in the target environment are within a certain range, the size of the fitted graph can be checked during the fitting process, and the fitting is determined to be successful only when the size meets the threshold.
  • the previous frame image is a frame image collected from the target environment before the target frame image, such as the frame image immediately preceding the target frame image, the first n frames of images collected before the target frame image, or the nth frame image acquired before the target frame image.
  • S206 may specifically include: if an edge point in the fitting graph does not appear in the object plane in the previous frame image, or does not appear in the plane outline corresponding to the object plane in the previous frame image, the edge point is deleted from the fitting graph.
  • the two objects in the figure are cubic objects, and the fitting graph of part of the surface of one object (the gray area in the figure) extends along the side face of the other object.
  • the reason for this phenomenon is that the depth information has a large error in this case, so the plane fitting is wrong. Therefore, for a fitting graph obtained from the edge points of the object plane in multiple frame images, the edge points in the fitting graph that do not appear in the object plane in the previous frame image are deleted (that is, the edge points inside the dotted ellipse in the figure are deleted). The edge points that do not appear in the object plane of the previous frame image are outliers; removing them avoids fitting errors in the fitting process and thereby improves the correctness of the fitting graph.
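A sketch of the outlier deletion described above, interpreting "does not appear in the object plane of the previous frame image" as having no close neighbour among the previous frame's plane edge points; the tolerance value is an assumption.

```python
import numpy as np

def remove_outlier_edge_points(fit_points, prev_plane_points, tol=0.02):
    """Delete edge points of the fitted graph that have no support in the
    previous frame's object plane (no neighbour within tol meters)."""
    kept = []
    for p in fit_points:
        if np.linalg.norm(prev_plane_points - p, axis=1).min() <= tol:
            kept.append(p)                      # supported by the previous frame
    return np.asarray(kept)
```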
  • the plane contour may refer to the contour of the surface of the object, and the contour may be a rectangular contour or a contour of another shape.
  • the terminal may generate a plane graph circumscribing the remaining edge points in the fitting graph, where the plane graph is the plane outline of the object.
  • the dotted line part in the left figure is the original fitting graph
  • the dotted line part in the right figure is the new fitting graph obtained after the edge points that do not appear in the object plane in the previous frame image have been deleted from the original fitting graph
  • the plane contour of the object is constructed according to the new fitting graph, so the recognition accuracy of the plane contour of the object can be improved.
  • the terminal can perform different operations in different application scenarios, and the operations performed by the terminal are described in the following application scenarios:
  • Scenario 1 the application scenario of robot movement.
  • the robot determines the robot movement path in each plane outline constructed by the target frame image, or selects the robot foothold in the plane outline; moves according to the robot movement path or the robot foothold.
  • Scenario 2 the application scenario of the robot gripping the target object.
  • the above-mentioned object plane is the area of the corresponding plane of the target object gripped by the robot arm.
  • the robot arm determines the size, orientation and spatial position of the target object according to the plane outline; grips the target object based on the size, orientation and spatial position of the target object; and places the gripped target object at the specified position.
  • the size may refer to at least two of the length, width, and height of the target object.
  • the orientation can refer to the direction the target object is facing, or it can refer to the direction in which the target object is placed, for example, the courier box is placed towards the robot arm.
  • the above-mentioned step of placing the gripped target object at a specified position may specifically include: during the process of the robotic arm moving to place the target object, the terminal collects an environmental image of the specified position; fits the edge points of each target object plane in the environmental image with the edge points of the corresponding target object plane in the previous frame of environmental image to obtain a target fitting graph; deletes, from the target fitting graph, the edge points that do not appear in the target object plane of the previous frame of environmental image, the previous frame of environmental image being a frame image collected before the environmental image; forms the target plane contour from the remaining edge points in the target fitting graph; determines the placement posture of the target object according to the target plane contour; and places the target object on the target object plane according to the placement posture.
  • the placement posture is the position and orientation of the target object.
  • when the target object needs to be placed on another object, the target plane contour corresponding to the other object can be identified through the above method, and the size and spatial position of the other object can then be determined according to the target plane contour;
  • the target object is then placed on the other object according to the size and spatial position of the other object, which ensures that the target object avoids collisions during the placement process and is placed on the target object plane.
  • Scenario 3 the application scenario of augmented reality.
  • when the terminal is an AR device, after the plane outline of the object is obtained, the size and spatial position of the plane outline are determined, and a virtual identifier indicating the size and spatial position is generated and displayed in the real picture of the target environment shown on the AR device.
  • the virtual identifier is displayed near the object in the real picture.
  • the target frame image is obtained by collecting the target environment, and by fitting the edge points of each object plane in the target frame image with the edge points of the corresponding object plane in the previous frame image, a fitting graph of each object plane can be obtained. If an edge point in the fitted graph does not appear in the object plane of the previous frame image, the edge point is deleted as an outlier, so as to obtain the plane outline composed of the remaining edge points. In this way, the plane contour of each object in the target frame image can be recognized without deep learning, which removes the training time and effectively improves the recognition efficiency of the plane contour of the target object. In addition, deleting the non-appearing edge points as outliers before obtaining the plane contour prevents the staggered placement of multiple objects from affecting plane contour identification, improving the accuracy of plane contour recognition.
  • as shown in FIG. 6, another method for identifying plane contours is provided; the method can be executed by the terminal 102, by the server 104, or by the terminal 102 and the server 104 in cooperation. The method being executed by the terminal 102 in FIG. 1 is used as an example for description, and the method includes the following steps:
  • the target environment may be the working environment where the terminal is located, such as the environment in which the robot sorts items in the field of express delivery, or the road environment when the robot walks in the working process.
  • the target frame image may refer to an image obtained by the terminal collecting an image or video of the target environment through a built-in camera, or an image obtained by collecting an image or video of the target environment through an independent camera.
  • the target frame image may be a three-dimensional image carrying depth information, such as a depth image.
  • the target frame image can be an image of a ladder in the target environment, as shown in FIG. 3 .
  • when a camera built in the terminal collects the target environment in real time to obtain a video, the video is decoded to obtain the target frame image; or, if an independent camera collects the target environment in real time to obtain a video, the video is sent to the terminal in real time, and the terminal decodes the video to obtain the target frame image.
  • the target frame image is a video frame in the video.
  • the terminal uses a built-in camera to capture the target environment at preset time intervals to obtain target frame images; or, the target environment is captured through an independent camera at preset time intervals to obtain target frame images, and the captured target frame images are then transmitted to the terminal.
  • the terminal may display the target frame image through its display screen.
  • the previous frame image is a frame image obtained by collecting the target environment before the target frame image.
  • the object plane may refer to the surface of the object displayed in the target frame image, that is, the surface of the object that is not occluded; in addition, the object plane may be a horizontal plane or a curved surface. In the following embodiments, a horizontal object plane is used as an example for description.
  • the object may refer to a target object in the target environment and captured in the target frame image, and the surface of the object in all directions may be a horizontal plane; in addition, the object may have a specific geometric shape.
  • the object may be a courier box for express delivery, or a ladder or the like.
  • the fitted graph may refer to a closed curve graph obtained by fitting edge points with a curve, or a closed graph obtained by fitting edge points with a straight line.
  • the dotted line graph is a fitting graph obtained by fitting the edge points of a certain step plane of the stairs. Since the target frame image may be a three-dimensional image, the corresponding edge point may also be a three-dimensional edge point.
  • the corresponding edge point may also be a voxel of the object plane in the target frame image, or a three-dimensional edge point of the target object in the target environment (ie, the real scene).
  • the corresponding edge points may also be two-dimensional pixels, or two-dimensional edge points of the target object in the target environment.
  • the terminal determines the spatial position corresponding to each point in the target frame image according to the depth information; determines the plane where each point is located based on the spatial position and the plane equation, and obtains the object plane; and fits the edge points of the object plane with the edge points of the corresponding object plane in the previous frame image to obtain a fitted graph.
  • the spatial position may be a spatial coordinate in a three-dimensional spatial coordinate system.
  • the target frame image includes a graphic code that carries direction information.
  • the above step of determining the spatial position corresponding to each point in the target frame image according to the depth information may specifically include: the terminal determines the reference direction of the coordinate system according to the direction information carried by the graphic code; constructs a spatial coordinate system based on the reference direction of the coordinate system; and, in the spatial coordinate system, determines the spatial position corresponding to each point in the target frame image based on the depth information.
  • the graphic code can be a barcode or a two-dimensional code, such as Apriltag used to indicate the direction.
  • the coordinate reference direction (also called the positive direction) of the spatial coordinate system can be determined through the Apriltag; the coordinate reference direction is taken as the x-axis direction, the normal direction of the object plane corresponding to the object is taken as the z-axis direction, and the y-axis direction is the cross product of the z-axis direction and the x-axis direction.
  • denoting the x-axis direction as vector a and the z-axis direction as vector b, the direction of the cross product b×a is given by the right-hand rule: with the four fingers curling from b toward a, the thumb points in the direction of b×a, which is perpendicular to the plane in which a and b lie.
  • the terminal can calculate, according to the depth information, the distance between each point of the object in the target frame image and the camera, as well as the distances between the points of the object; according to these distances, the spatial position of each point of the object in the target frame image in the spatial coordinate system can be determined.
  • the step of determining the plane where each point on the surface of the object is located may specifically include: the terminal inputs the spatial positions (that is, the spatial coordinates) and the plane equation into a plane fitting model, and the plane fitting model fits the spatial positions of the points of the object according to the plane equation to obtain the plane where each point is located, thereby obtaining the object plane.
  • the terminal uses straight lines or curves to fit the edge points of the object plane to obtain a closed curve graph or a closed graph composed of straight line segments, and the closed curve graph or closed graph is determined as the fitted graph of the edge points of the object plane.
  • the above-mentioned step of fitting the edge points of the object plane and the edge points of the corresponding object plane of the previous frame image to obtain the fitted graph may specifically include: when the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image, the terminal determines the previous target frame images containing the object plane among the previous frame images; extracts edge points from the object plane in the previous target frame images; selects target edge points from the edge points extracted from the object plane in the previous target frame images and the edge points of the object plane in the target frame image; and fits the selected target edge points to obtain a fitted graph.
  • the above-mentioned edge points are three-dimensional edge points.
  • the previous frame image is the frame image collected from the target environment before the target frame image, such as the previous frame image of the target frame image, or the first n frame images collected before the target frame image (that is, the previous frame image to the previous nth frame).
  • the previous target frame image refers to the image of the object plane containing the object in the previous frame image.
  • the selection process of the target edge points may include: first superimposing the previous target frame image and the target frame image, so that the edge points of the object plane in the target frame image and the edge points of the corresponding object plane in the previous target frame image are combined into an edge point set, and then selecting edge points from the dense regions of the edge point set; selecting edge points from the dense regions avoids interference from discrete points. For example, after the edge points are superimposed, some edge points may deviate from the normal range, and excluding these edge points can improve the fitting effect.
  • weights may also be set for each frame image according to the distance between the camera and the target object, and the target edge points are then selected according to the weights: more target edge points are selected from frame images with larger weights, and fewer target edge points are selected from frame images with smaller weights.
  • the step of judging whether the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image, may specifically include: the terminal maps the three-dimensional edge points to two-dimensional edge points; determines the convex polygon corresponding to the two-dimensional edge points and calculates the area of the convex polygon; determines the circumscribed figure of the two-dimensional edge points and calculates the area of the circumscribed figure; when the ratio of the area of the convex polygon to the area of the circumscribed figure reaches a preset ratio, it is determined that the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image.
  • a convex polygon means: if any side of the polygon is extended infinitely in both directions into a straight line, all other sides lie on the same side of that line; none of the interior angles of the polygon is a reflex angle; and any line segment between two vertices lies inside the polygon or on one of its edges.
  • the above-mentioned step of selecting the target edge points from the edge points extracted from the object plane in the previous target frame image and the edge points of the object plane in the target frame image may specifically include: the terminal determines a first weight corresponding to the target frame image and a second weight corresponding to the previous target frame image, the first weight and the second weight being unequal; among the edge points of the object plane in the target frame image, first target edge points are selected according to the first weight; among the edge points extracted from the object plane in the previous target frame image, second target edge points are selected according to the second weight; and the first target edge points and the second target edge points are used as the target edge points.
  • the sizes of the first weight and the second weight are related to the distance between the camera and the object when the corresponding image was captured: the farther the camera is from the object in the target environment, the smaller the corresponding weight; likewise, the closer the camera is to the object, the larger the corresponding weight.
  • the terminal determines the size of the fitted graphic; when the size is smaller than the preset size, reacquires the target frame image obtained by collecting the target environment, and then executes S604. When the size is greater than or equal to the preset size, perform S606.
  • if the length and width of the object plane corresponding to the object in the target environment are within a certain range, the size of the fitted graph can be checked during the fitting process, and the fitting is determined to be successful only when the size meets the threshold.
  • the previous frame image is a frame image collected from the target environment before the target frame image, such as the frame image immediately preceding the target frame image, the first n frames of images collected before the target frame image, or the nth frame image acquired before the target frame image.
  • S606 may specifically include: if an edge point in the fitting graph does not appear in the object plane in the previous frame image, or does not appear in the plane outline corresponding to the object plane in the previous frame image, the edge point is deleted from the fitting graph.
  • the two objects in the figure are cubic objects, and the fitting graph of part of the surface of one object (the gray area in the figure) extends along the side face of the other object.
  • the reason for this phenomenon is that the depth information has a large error in this case, so the plane fitting is wrong. Therefore, for a fitting graph obtained from the edge points of the object plane in multiple frame images, the edge points in the fitting graph that do not appear in the object plane in the previous frame image are deleted (that is, the edge points inside the dotted ellipse in the figure are deleted). The edge points that do not appear in the object plane of the previous frame image are outliers; removing them avoids fitting errors in the fitting process and thereby improves the correctness of the fitting graph.
  • the plane contour may refer to the contour of the object plane of the target object, and the contour may be a rectangular contour or a contour of another shape.
  • the terminal may generate a plane graph circumscribing the remaining edge points in the fitting graph, where the plane graph is the plane outline of the object.
  • the dotted line part in the left figure is the original fitting graph
  • the dotted line part in the right figure is the new fitting graph obtained after the edge points that do not appear in the object plane in the previous frame image have been deleted from the original fitting graph
  • the plane contour of the object is constructed according to the new fitting graph, so the recognition accuracy of the plane contour of the object can be improved.
  • the terminal can perform different operations in different application scenarios, and the operations performed by the terminal are described in the following application scenarios:
  • Scenario 1 the application scenario of robot movement.
  • the robot determines the robot movement path in each plane outline constructed by the target frame image, or selects the robot foothold in the plane outline; moves according to the robot movement path or the robot foothold.
  • Scenario 2 the application scenario of the robot gripping the target object.
  • the above-mentioned object plane is the area of the corresponding plane of the target object.
  • the robotic arm determines the size, orientation and spatial position of the target object according to the plane outline; grips the target object based on the size, orientation and spatial position of the target object; and places the gripped target object at the specified position.
  • the size may refer to at least two of the length, width, and height of the target object.
  • the orientation can refer to the direction the target object is facing, or it can refer to the direction in which the target object is placed, for example, the courier box is placed towards the robot arm.
  • the above-mentioned step of placing the gripped target object at a specified position may specifically include: during the process of the robotic arm moving to place the target object, the terminal collects an environmental image of the specified position; fits the edge points of each target object plane in the environmental image with the edge points of the corresponding target object plane in the previous frame of environmental image to obtain a target fitting graph; deletes, from the target fitting graph, the edge points that do not appear in the target object plane of the previous frame of environmental image, the previous frame of environmental image being a frame image collected before the environmental image; forms the target plane contour from the remaining edge points in the target fitting graph; determines the placement posture of the target object according to the target plane contour; and places the target object on the target object plane according to the placement posture.
  • the placement posture is the position and orientation of the target object.
  • when the target object needs to be placed on another object, the target plane contour corresponding to the other object can be identified through the above method, and the size and spatial position of the other object can then be determined according to the target plane contour;
  • the target object is then placed on the other object according to the size and spatial position of the other object, which ensures that the target object avoids collisions during the placement process and is placed on the target object plane.
  • Scenario 3 the application scenario of augmented reality.
  • when the terminal is an AR device, after the plane outline of the object is obtained, the size and spatial position of the plane outline are determined, and a virtual identifier indicating the size and spatial position is generated and displayed in the real picture of the target environment shown on the AR device.
  • the virtual identifier is displayed near the object in the real picture.
  • the target frame image is obtained by collecting the target environment, and the edge points of each object plane in the target frame image are fitted with the edge points of the corresponding object plane in the previous frame image, so as to obtain the fitting graph of each object plane.
  • if an edge point in the fitting graph does not appear in the object plane of the previous frame image, the edge point is deleted as an outlier, so as to obtain the plane outline composed of the remaining edge points; this removes the time-consuming training required when deep learning is used and effectively improves the recognition efficiency of the plane outline of the target object.
  • deleting the non-appearing edge points as outliers before obtaining the plane outline of the object prevents the staggered placement of multiple objects from affecting the recognition of the plane outline, improving the accuracy of plane outline recognition.
  • the above method is applied to a mobile robot or a robotic arm as an example for illustration.
  • the video is collected by the RGB-D camera.
  • each video frame contains an RGB color map and a depth map at a given time (the depth map is the above-mentioned target frame image carrying depth information).
  • the rectangular contour recognition method is mainly completed based on depth information.
  • An example of rectangular contour recognition is shown in Figure 3.
  • the surface of each step in the figure is a rectangle, and time domain fusion is performed on the step in the depth map.
  • the resulting rectangular plane is represented by a dashed line, the final rectangular outline is represented by a black solid line, and the vertices of the rectangular outline are represented by a cross-shaped pattern, as shown in Figure 3.
  • the rectangular plane obtained by time domain fusion refers to: the 3D edge points of the step plane in the depth maps at different times are fused into the spatial coordinate system at a certain moment, target 3D edge points are then extracted from the fused 3D edge points, and the rectangular plane fitted according to the target edge points is the fitting graph described in the above embodiment.
  • the rectangular plane in the target environment may be large, and the field of view of a single frame image may not cover the entire rectangular plane.
  • on the one hand, it must be determined whether the fitted plane in the environment is the target rectangular plane; on the other hand, in the process of identifying the rectangular plane, the three-dimensional edge points of the rectangular plane belonging to the same object in different frame images are fused, and on this basis more precise and complete rectangular areas are optimized and extracted.
  • the spatial positions of all rectangular planes in the target environment can be determined, so that corresponding operations can be performed, which are explained according to different scenarios:
  • the solution in this embodiment can provide visual information such as the size, orientation and spatial position of the object in three-dimensional space, and the robotic arm can accurately grip the target object based on this visual information. At the same time, the robotic arm can also place this type of object at a designated position in the target environment, control the posture of the target object after placement, and avoid collision between the target object and other objects in the target environment.
  • Scenario 2 a mobile scenario for a mobile robot
  • the mobile robot can be a legged robot. If difficult scenarios such as steps are encountered during the movement of the legged robot, this method can provide an accurate foothold area for the legged robot in real time, so as to prevent the legged robot from stepping into the air and to prevent the robot's feet from colliding with objects in the target environment during the movement.
  • the foothold area also helps the legged robot choose a more reasonable foothold or plan a more reasonable moving path.
  • Scene universality: the recognition of the rectangular outline in this embodiment is universal and can be applied to most scenarios.
  • the recognition algorithm of the rectangular outline in this embodiment is robust, and there are few failures.
  • the positive direction is obtained based on the graphic code.
  • the graphic code can be Apriltag, or other graphic that can be used to indicate the direction.
  • a space coordinate system is constructed on the rectangular plane of the target object.
  • the normal of the plane is used as the z-axis of the coordinate system
  • the positive direction of the plane is used as the x-axis of the coordinate system.
  • the positive direction of the plane is the direction parallel to one pair of opposite sides of the rectangle.
  • the Apriltag provides a positive direction, that is, placing an Apriltag in front of the target object so that it faces parallel to a certain pair of sides of the rectangular plane of the target object, which can help improve the universality and robustness of the scene.
  • the multiple planes extracted from the depth map by the plane fitting algorithm serve as input; the specific information is all three-dimensional point coordinates corresponding to each plane in the world coordinate system and the plane equation of each plane.
  • the current step extracts the 3D edge points corresponding to each rectangular plane, and the subsequent operations are performed mainly on these 3D edge points; otherwise, the whole algorithm would be difficult to run in real time.
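The patent does not specify how the 3D edge points are extracted; one simple approach, sketched below, is to take the morphological boundary of each plane's pixel mask and keep only the back-projected 3D points on that boundary (the mask and per-pixel 3D point image are assumed inputs).

```python
import numpy as np
import cv2

def extract_plane_edge_points(plane_mask, points_3d_image):
    """Keep only the 3D points on the boundary of a plane's pixel mask so that
    later steps work on far fewer points (helps the real-time requirement).
    plane_mask: (H, W) uint8 mask; points_3d_image: (H, W, 3) 3D point per pixel."""
    eroded = cv2.erode(plane_mask, np.ones((3, 3), np.uint8))
    boundary = cv2.subtract(plane_mask, eroded)   # one-pixel-wide plane boundary
    return points_3d_image[boundary.astype(bool)]
```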
  • three-dimensional edge points are mapped to two-dimensional edge points and related operations are completed.
  • the specific operation is: firstly map the 3D edge points of the object plane to the object plane fitted in S2, then construct a 2D coordinate system in the object plane, and obtain the 2D coordinates corresponding to the 3D edge points, that is, to obtain two dimensional edge points.
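A sketch of mapping three-dimensional edge points into a two-dimensional coordinate system constructed in the fitted object plane; the plane origin, positive direction and normal are assumed to come from the earlier steps.

```python
import numpy as np

def to_plane_2d(points_3d, plane_origin, x_axis, normal):
    """Express 3D edge points in a 2D coordinate system lying in the object plane.
    x_axis is the plane's positive direction and normal its unit normal; the
    y axis is taken as normal x x_axis so that (x, y) spans the plane."""
    y_axis = np.cross(normal, x_axis)
    rel = points_3d - plane_origin
    return np.stack([rel @ x_axis, rel @ y_axis], axis=1)
```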
  • the convex polygon corresponding to the two-dimensional edge points of the object plane is used to represent the rectangular area of the target object, and the corresponding area s1 of the convex polygon in the two-dimensional coordinate system is then calculated. Since the edge points obtained in S3 and S4 are discrete, and errors in the depth information may cause the two-dimensional edge points to form irregular shapes (such as zigzags), using a convex polygon to represent the rectangular area of the target object avoids interference from these uncontrollable factors.
  • the left figure shows the fitting result of the conventional method. If the positive direction of the object plane is unknown, the fitted circumscribed rectangle is often quite different from the real rectangle outline.
  • the picture on the right shows the minimum circumscribed rectangle fitting that meets the expectation.
  • the fitting methods include: one solution is to use the positive direction of the rectangular plane obtained in S1, first rotate the orientation of the convex polygon to the positive direction, and then perform the circumscribed rectangle fitting; another solution is to directly complete the minimum circumscribed rectangle fitting of the convex polygon through corresponding algorithms. To a certain extent, the above fitting schemes can deal with errors in the depth information and occlusions between planes in the scene.
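A sketch of the first fitting scheme: rotating the two-dimensional convex-polygon points so that the positive direction of the rectangular plane lies along the x axis and then taking the axis-aligned bounding box as the circumscribed rectangle; function and variable names are illustrative.

```python
import numpy as np

def circumscribed_rect_in_positive_direction(points_2d, positive_dir):
    """Rotate the convex-polygon points so positive_dir lies along +x, take the
    axis-aligned bounding box, and map its corners back to the original frame."""
    c, s = positive_dir / np.linalg.norm(positive_dir)
    rot = np.array([[c, s], [-s, c]])            # rotates positive_dir onto +x
    p = points_2d @ rot.T
    xmin, ymin = p.min(axis=0)
    xmax, ymax = p.max(axis=0)
    corners = np.array([[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]])
    return corners @ rot                          # back to the original 2D frame
```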
  • the preset threshold value may be set according to the actual situation.
  • a preset threshold of 0.9 is used here as an example. If an object plane fitted in S2 differs substantially from a rectangle, then the overlap between its convex polygon from S5 and its minimum circumscribed rectangle from S6 is limited; based on this property, non-rectangular planes are screened out by requiring the area ratio to reach the 0.9 threshold.
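  • The screening itself then reduces to a simple ratio test; the 0.9 default below mirrors the example threshold above and would be tuned per scene.

    def looks_rectangular(s1, s2, threshold=0.9):
        """Keep the plane only if its convex polygon covers enough of the circumscribed rectangle."""
        return s2 > 0 and (s1 / s2) >= threshold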
  • as for occlusion, on the one hand the occlusion usually improves gradually as the object plane moves from far to near the camera over time; on the other hand, fusing the rectangular plane contour over the time domain also alleviates the influence of occlusion to a certain extent; and for complex occlusion situations, the above preset threshold can be adjusted accordingly.
  • first, for the depth maps of different frames, the 3D edge points of the object plane in each frame are fused in the time domain; the objects of the fusion are the multiple groups of 3D edge points that correspond to the same rectangular plane over the time-domain process.
  • based on the SLAM result, the 3D edge points from different times are transformed into the coordinate system of one chosen time, after which 3D edge points are re-extracted from these discrete points.
  • the purpose of re-extracting the 3D edge points is to reduce the amount of data. This step keeps fusing the rectangular plane of the target object over the time-domain process, and the obtained result approaches the desired rectangular region.
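  • A sketch of the fusion step, under the assumption that per-frame camera poses (e.g. from SLAM) are available as 4x4 transforms into a common reference frame; neither the pose representation nor the function name is fixed by the text.

    import numpy as np

    def fuse_edge_points(edge_points_per_frame, poses_to_ref):
        """Accumulate per-frame 3D edge points of the same rectangular plane in one coordinate system.

        edge_points_per_frame : list of (N_i, 3) arrays, one per frame
        poses_to_ref          : list of (4, 4) transforms mapping frame i into the reference frame
        """
        fused = []
        for pts, T in zip(edge_points_per_frame, poses_to_ref):
            homog = np.hstack([pts, np.ones((len(pts), 1))])
            fused.append((homog @ T.T)[:, :3])
        # Edge points would then be re-extracted from this fused set to keep the data volume small.
        return np.vstack(fused)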
  • the second part is the optimization of the parameters of the rectangular plane equation, as shown in the formulas below, where d denotes the 3D distance from the center point or mean point of the rectangular plane to the camera's optical center, w_i is the weight of the parameter to be optimized for the i-th frame, and c_i denotes the parameter to be optimized (such as the normal direction) of the rectangular plane equation detected in the i-th frame.
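  • The formulas themselves are not reproduced in this text (they appear only as image placeholders in the original filing), so the following is no more than a plausible reading of the description: weights w_i shrink as the distance d_i from the plane to the camera's optical center grows, and the fused parameter is the weight-normalized combination of the per-frame parameters c_i.

    import numpy as np

    def fuse_plane_parameter(c_list, d_list):
        """Hypothetical weighted fusion of per-frame plane parameters (e.g. normals)."""
        w = np.array([1.0 / max(d, 1e-6) for d in d_list])   # assumed inverse-distance weights
        w = w / w.sum()
        c = sum(wi * np.asarray(ci) for wi, ci in zip(w, c_list))
        return c / np.linalg.norm(c)   # re-normalize when the fused parameter is a plane normal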
  • based on the result of S8, a regular rectangular outline is obtained, that is, the four vertices of the rectangular outline.
  • S1 and S2 provide the positive direction and the normal direction respectively, and the spatial coordinate system of the rectangular plane is constructed from them. As shown in Figure 7, the normal direction is the z-axis, the positive direction is the x-axis, y is the cross product of the x-axis and the z-axis, and the x-y plane coincides with the rectangular plane.
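  • Combining the two, the four vertices follow from the extent of the fused edge points in the plane's x-y coordinates; a sketch with assumed helper names:

    import numpy as np

    def rectangle_vertices(points_3d, R, origin):
        """Four vertices of the rectangular outline, returned as 3D points on the plane.

        R's columns are the plane's x, y, z axes; origin lies on the plane.
        """
        local = (points_3d - origin) @ R
        x_min, y_min = local[:, 0].min(), local[:, 1].min()
        x_max, y_max = local[:, 0].max(), local[:, 1].max()
        corners_2d = [(x_min, y_min), (x_min, y_max), (x_max, y_max), (x_max, y_min)]
        # z = 0 in the plane frame; map each corner back to world coordinates.
        return [origin + R @ np.array([cx, cy, 0.0]) for cx, cy in corners_2d]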
  • S10: determine whether the area of the rectangular plane reaches a preset size threshold.
  • the gray area is the plane-fitting area; it can be seen that the object plane fitted on the upper surface of the front cube extends along the side face of the rear cube, resulting in a plane-fitting error.
  • the left picture shows the rectangular-plane contour recognized when the discrete points obtained by the time-domain fusion in step S8 still contain outliers;
  • the right picture shows the result after the rectangular-plane contour has been optimized.
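  • This optimization can be sketched as a point-in-polygon filter: fused points that are visible in the current frame but fall outside the plane contour detected in that single frame are discarded as outliers. The use of matplotlib's Path for the containment test is an illustrative choice.

    import numpy as np
    from matplotlib.path import Path

    def remove_outliers(fused_2d, frame_contour_2d, in_view_mask):
        """Drop fused edge points that the current frame sees but that lie outside
        the single-frame plane contour; points outside the view are kept as-is."""
        inside = Path(frame_contour_2d).contains_points(fused_2d)
        keep = ~in_view_mask | inside
        return fused_2d[keep]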
  • the vertices of the rectangular outline may refer to the four vertices A-D in FIG. 3 .
  • by introducing the minimum circumscribed rectangle, the present patent filters out non-rectangular planes while more robustly handling depth information containing errors and scenes where planes are occluded;
  • by fusing and optimizing the time-domain information, the final rectangular outline is made more complete and accurate.
  • a plane contour identification device is provided; the device can be a software module, a hardware module, or a combination of the two forming part of a computer device, and it specifically includes: a first display module 1002, a superimposing module 1004, a deletion module 1006 and a second display module 1008, wherein:
  • the first display module 1002 is used to display the target frame image obtained by collecting the target environment;
  • the superimposing module 1004 is used to superimpose, on the object plane for display, a fitting graph formed by fitting the edge points of each object plane in the target frame image with the edge points of the corresponding object plane in the previous frame image; the previous frame image is a frame image collected from the target environment before the target frame image;
  • the deletion module 1006 is used to delete, from the fitting graph, the edge points that do not appear in the object plane of the previous frame image;
  • the second display module 1008 is configured to display, on the object plane of the target frame image, the plane outline formed by the remaining edge points in the fitting graph.
  • the target frame image includes depth information; as shown in FIG. 11 , the apparatus further includes: a fitting module 1010; wherein:
  • the fitting module 1010 is used to determine the spatial position corresponding to each point in the target frame image according to the depth information; determine the plane where each point is located based on the spatial position and the plane equation to obtain the object plane; and fit the edge points of the object plane with the edge points of the corresponding object plane of the previous frame image to obtain the fitted graph.
  • the fitting module 1010 is further configured to, when the object plane is a partial area plane of the object or is occluded by other objects in the target frame image, determine among the previous frame images a prior target frame image containing the object plane; extract edge points from the object plane in that prior target frame image; select target edge points from the edge points so extracted and the edge points of the object plane in the target frame image; and fit the selected target edge points to obtain the fitted graph.
  • the edge point is a three-dimensional edge point; as shown in FIG. 11 , the apparatus further includes: a mapping module 1012 and a determination module 1014; wherein:
  • mapping module 1012 configured to map three-dimensional edge points to two-dimensional edge points
  • the determining module 1014 is used to determine the convex polygon corresponding to the two-dimensional edge points and calculate its area; determine the circumscribed graph of the two-dimensional edge points and calculate its area; and, when the ratio of the area of the convex polygon to the area of the circumscribed graph reaches a preset ratio, determine that the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image.
  • in the above embodiment, the target frame image is obtained by collecting the target environment, and by fitting the edge points of each object plane in the target frame image with the edge points of the corresponding object plane in the previous frame image, a fitted graph of each object plane can be obtained. If an edge point in the fitted graph does not appear in the object plane of the previous frame image, it is deleted as an outlier, so the plane outline is composed of the remaining edge points. In this way the plane outline of each object in the target frame image can be recognized without using deep learning, which reduces training time and effectively improves the recognition efficiency of the target object's plane outline. In addition, deleting the non-appearing edge points as outliers to obtain the plane outline of the object prevents the interleaved placement of multiple objects from affecting the recognition of the plane outline, improving the accuracy of plane-outline recognition.
  • a plane contour identification device is provided; the device can be a software module, a hardware module, or a combination of the two forming part of a computer device, and it specifically includes: an acquisition module 1202, a fitting module 1204, a deletion module 1206 and a building module 1208, wherein:
  • the acquisition module 1202 is configured to obtain the target frame image collected from the target environment;
  • the fitting module 1204 is used to fit the edge points of each object plane in the target frame image and the edge points of the corresponding object planes in the previous frame image to obtain a fitting graph;
  • the deletion module 1206 is used to delete the edge points that do not appear in the object plane of the previous frame image in the fitting graph
  • the building module 1208 is configured to identify the contour formed by the remaining edge points in the fitted graph as a plane contour.
  • the apparatus further includes: a first planning module 1210; wherein:
  • the first planning module 1210 is used to determine the robot's moving path among the plane contours constructed from the target frame image, or to select the robot's landing point within a plane contour, and to move according to the determined moving path or landing point.
  • the object plane is a region of the corresponding plane of the target object; as shown in FIG. 13, the apparatus further includes: a second planning module 1212; wherein:
  • the second planning module 1212 is used to determine the size, orientation and spatial position of the target object according to the plane outline; grip the target object based on its size, orientation and spatial position; and place the gripped target object at a specified position.
  • the apparatus further includes: a placing module 1214; wherein:
  • the acquisition module 1202 is further configured to collect an environmental image of a specified position during the process of moving the robotic arm to place the target object;
  • the fitting module 1204 is further configured to fit the edge points of each target object plane in the environment image with the edge points of the corresponding target object plane in the previous frame of the environment image to obtain a target fitting graph;
  • the deletion module 1206 is also used to delete edge points that do not appear in the target object plane of the previous frame environment image in the target fitting graph;
  • the previous frame environment image is the frame image collected before the environment image;
  • the building module 1208 is further configured to form the target plane outline by the remaining edge points in the target fitting graph;
  • the placing module 1214 is configured to determine the placement posture of the target object according to the contour of the target plane; and place the target object on the target object plane according to the placement posture.
  • the target frame image contains depth information; the fitting module 1204 is further configured to determine the spatial position corresponding to each point in the target frame image according to the depth information; determine the plane where each point is located based on the spatial position and the plane equation to obtain the object plane; and fit the edge points of the object plane with the edge points of the corresponding object plane of the previous frame image to obtain the fitted graph.
  • the target frame image includes a graphic code that carries direction information.
  • the fitting module 1204 is further configured to determine the reference direction of the coordinate system according to the direction information carried by the graphic code; construct a spatial coordinate system based on that reference direction; and, in the spatial coordinate system, determine the spatial position corresponding to each point in the target frame image based on the depth information.
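  • A sketch of how depth information might be turned into per-point spatial positions once such a coordinate system is fixed, using a pinhole back-projection; the intrinsic parameters fx, fy, cx, cy and the pose T_world_cam are assumptions of the sketch, not values given by the text.

    import numpy as np

    def depth_to_points(depth, fx, fy, cx, cy, T_world_cam):
        """Back-project a depth map into 3D points expressed in the spatial coordinate system."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        pts_cam = np.stack([x, y, depth, np.ones_like(depth)], axis=-1).reshape(-1, 4)
        return (pts_cam @ T_world_cam.T)[:, :3]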
  • the fitting module 1204 is further configured to determine, among the previous frame images, the prior target frame image containing the object plane when the object plane is a partial area plane of the object or is occluded by other objects in the target frame image; extract edge points from the object plane in that prior target frame image; select target edge points from those edge points and the edge points of the object plane in the target frame image; and fit the selected target edge points to obtain the fitted graph.
  • the edge points are three-dimensional edge points; as shown in FIG. 13 , the apparatus further includes: a mapping module 1216 and a determination module 1218; wherein:
  • the mapping module 1216 is used to map the three-dimensional edge points to the two-dimensional edge points;
  • the determination module 1218 is used to determine the convex polygon corresponding to the two-dimensional edge points and calculate its area; determine the circumscribed graph of the two-dimensional edge points and calculate its area; and, when the ratio of the area of the convex polygon to the area of the circumscribed graph reaches a preset ratio, determine that the object plane is a partial area plane of the object, or is occluded by other objects in the target frame image.
  • the fitting module 1204 is further configured to determine a first weight corresponding to the target frame image and a second weight corresponding to the prior target frame image, the first weight and the second weight being unequal; select first target edge points from the edge points of the object plane in the target frame image according to the first weight; select second target edge points from the edge points extracted from the object plane in the prior target frame image according to the second weight; and use the first target edge points and the second target edge points as the target edge points.
  • the determining module 1218 is further configured to determine the size of the fitted graph
  • the acquisition module 1202 is further configured to re-acquire a target frame image collected from the target environment when the size is smaller than the preset size;
  • the deletion module 1206 is further configured to delete edge points that do not appear in the object plane of the object in the previous frame image in the fitting graph when the size is greater than or equal to the preset size.
  • in this embodiment, the target frame image is obtained by collecting the target environment, and the edge points of each object plane in the target frame image are fitted with the edge points of the corresponding object plane in the previous frame image to obtain the fitting graph of each object plane.
  • if an edge point in the fitting graph does not appear in the object plane of the previous frame image, that edge point is deleted as an outlier, so the plane outline is composed of the remaining edge points; this removes the training time that using deep learning would entail and effectively improves the recognition efficiency of the target's plane outline.
  • in addition, deleting the non-appearing edge points as outliers to obtain the plane outline of the object prevents the interleaved placement of multiple objects from affecting the recognition of the plane outline, improving the accuracy of plane-outline recognition.
  • All or part of the modules in the above-mentioned plane contour recognition device can be implemented by software, hardware and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 14 .
  • the computer device includes a processor, memory, and a network interface connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the nonvolatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • the computer device's database is used to store image data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program, when executed by the processor, implements a plane contour recognition method.
  • a computer device is provided, and the computer device may also be a terminal, and its internal structure diagram may be as shown in FIG. 15 .
  • the computer equipment includes a processor, memory, a communication interface, a display screen, and an input device connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the nonvolatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • the communication interface of the computer device is used for wired or wireless communication with an external terminal, and the wireless communication can be realized by WIFI, operator network, NFC (Near Field Communication) or other technologies.
  • the computer program, when executed by the processor, implements a plane contour recognition method.
  • the display screen of the computer equipment may be a liquid crystal display or an electronic ink display, and its input device may be a touch layer covering the display screen, a button, trackball or touchpad provided on the housing, or an external keyboard, touchpad or mouse.
  • FIGS. 14 and 15 are only block diagrams of partial structures related to the solution of the present application, and do not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • a device may include more or fewer components than shown in the figures, or combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, where a computer program is stored in the memory, and the processor implements the steps in the foregoing method embodiments when the processor executes the computer program.
  • a computer-readable storage medium which stores a computer program, and when the computer program is executed by a processor, implements the steps in the foregoing method embodiments.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the steps in the foregoing method embodiments.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, or optical memory, and the like.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • the RAM may be in various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

本申请涉及一种平面轮廓识别方法、装置、计算机设备和存储介质。所述方法包括:显示对目标环境进行采集所得的目标帧图像(S202);将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像(S204);在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除(S206);在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余边缘点构成的平面轮廓(S208)。

Description

平面轮廓识别方法、装置、计算机设备和存储介质
本申请要求于2020年09月01日提交中国专利局,申请号为2020109016477,发明名称为“平面轮廓识别方法、装置、计算机设备和存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及计算机技术领域,特别是涉及一种平面轮廓识别方法、装置、计算机设备和存储介质。
背景技术
随着机械化技术的不断发展,越来越多的智能设备投入到相应领域进行应用,而这些智能设备在应用过程中,需要对目标环境中目标物的空间位置和大小进行有效识别,然后才能做出相应的操作。如快递领域的分拣场景中,智能机器人需要识别出待分拣物品的空间位置和大小,然后可以准确夹取待分拣物品进行分拣。
传统方案中,通常是采用深度学习的方法对目标物的平面轮廓进行识别,从而利用识别的平面轮廓确定目标物具体的空间位置和大小。然而,采用传统方案中的深度学习方法识别目标物的平面轮廓,其识别准确性往往过度依赖于预训练,而且训练耗时长,从而影响目标物平面轮廓识别的效率。
发明内容
根据本申请的各种实施例,提供了一种平面轮廓识别方法、装置、计算机设备和存储介质。
一种平面轮廓识别方法,由计算机设备执行,所述方法包括:
显示对目标环境进行采集所得的目标帧图像;
将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余边缘点构成的平面轮廓。
一种平面轮廓识别装置,所述装置包括:
第一显示模块,用于显示对目标环境进行采集所得的目标帧图像;
叠加模块,用于将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
删除模块,用于在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
第二显示模块,用于在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余 边缘点构成的平面轮廓。
一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现以下步骤:
显示对目标环境进行采集所得的目标帧图像;
将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余边缘点构成的平面轮廓。
一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现以下步骤:
显示对目标环境进行采集所得的目标帧图像;
将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;
在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余边缘点构成的平面轮廓。
一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机指令,所述计算机指令存储在计算机可读存储介质中;计算机设备的处理器从所述计算机可读存储介质读取并执行所述计算机指令时,使得所述计算机设备执行上述平面轮廓识别方法的步骤。
一种平面轮廓识别方法,由计算机设备执行,所述方法包括:
获取对目标环境进行采集所得的目标帧图像;
对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
将所述拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
一种平面轮廓识别装置,所述装置包括:
获取模块,用于获取对目标环境进行采集所得的目标帧图像;
拟合模块,用于对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
删除模块,用于在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
构建模块,用于将所述拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现以下步骤:
获取对目标环境进行采集所得的目标帧图像;
对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
将所述拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现以下步骤:
获取对目标环境进行采集所得的目标帧图像;
对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
将所述拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机指令,所述计算机指令存储在计算机可读存储介质中;计算机设备的处理器从所述计算机可读存储介质读取并执行所述计算机指令时,使得所述计算机设备执行上述平面轮廓识别方法的步骤。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征和优点将从说明书、附图以及权利要求书变得明显。
附图说明
图1为一个实施例中平面轮廓识别方法的应用环境图;
图2为一个实施例中平面轮廓识别方法的流程示意图;
图3为一个实施例中在目标帧图像上显示拟合图形和平面轮廓的示意图;
图4为一个实施例中对两个交错放置的立方体物体拟合而成的平面轮廓的示意图;
图5为一个实施例中对两个交错放置的立方体物体的拟合图形进行优化,根据优化后的拟合图形拟合平面轮廓的示意图;
图6为另一个实施例中平面轮廓识别方法的流程示意图;
图7为一个实施例中构建空间坐标系的示意图;
图8为另一个实施例中平面轮廓识别方法的流程示意图;
图9为一个实施例中未使用Apriltag拟合而成的矩形轮廓和使用Apriltag拟合而成的矩形轮廓之间对比的示意图;
图10为一个实施例中平面轮廓识别装置的结构框图;
图11为另一个实施例中平面轮廓识别装置的结构框图;
图12为另一个实施例中平面轮廓识别装置的结构框图;
图13为另一个实施例中平面轮廓识别装置的结构框图;
图14为一个实施例中计算机设备的内部结构图;
图15为另一个实施例中计算机设备的内部结构图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。
本申请提供的平面轮廓识别方法,可以应用于如图1所示的应用环境中。在该应用环境中,包括终端102和服务器104。终端102采集目标环境得到目标帧图像,将采集的目标帧 图形进行显示;终端102对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合得到拟合图形,然后将拟合图形叠加于物体平面上展示;在拟合图形中将未在在先帧图像中物体的物体平面内出现的边缘点删除,该在先帧图像是在目标帧图像前对目标环境采集的帧图像;在目标帧图像的物体平面上,显示通过拟合图形中的剩余边缘点构成的平面轮廓。
此外,拟合图形的拟合过程,可以包括:终端102也可以将目标帧图像和在先帧图像发送给服务器104,由服务器104对目标帧图像中各物体平面的边缘点进行拟合得到拟合图形,然后将拟合图形发送至终端102以叠加于物体平面上展示。
其中,终端102可以是智能手机、平板电脑、笔记本电脑、台式计算机、智能音箱、智能手表等,但并不局限于此。
服务器104可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群,可以是提供云服务器、云数据库、云存储和CDN等基础云计算服务的云服务器。
终端102与服务器104之间可以通过蓝牙、USB(Universal Serial Bus,通用串行总线)或者网络等通讯连接方式进行连接,本申请在此不做限制。
在一个实施例中,如图2所示,提供了一种平面轮廓识别方法,该方法可由终端102执行,或由终端102和服务器104协同执行,以该方法由图1中的终端102执行为例进行说明,包括以下步骤:
S202,显示对目标环境进行采集所得的目标帧图像。
其中,目标环境可以是终端所处的工作环境,该目标环境如快递领域中机器人进行物品分拣的环境,又如机器人在工作过程中行走时的道路环境。
目标帧图像可以指终端通过内置的摄像头采集目标环境的图像或视频所得到的图像,或者通过独立的摄像头采集目标环境的图像或视频所得到的图像。目标帧图像可以是携带深度信息的三维图像,如深度图像。在实际应用中,目标帧图像可以是目标环境中阶梯的图像,如图3所示。
在一个实施例中,若终端内置的摄像头实时采集目标环境得到视频时,对该视频进行解码得到目标帧图像,然后将解码所得的各目标帧图像进行显示。或者,若独立的摄像头实时采集目标环境得到视频时,将该视频实时发送至终端,终端对该视频进行解码得到目标帧图像,然后将解码所得的各目标帧图像进行显示。其中,该目标帧图像为该视频中的视频帧。
在另一个实施例中,若终端内置的摄像头按照预设时间间隔拍摄目标环境得到目标帧图像时,将拍摄的目标帧图像在显示屏进行显示。或者,若独立的摄像头按照预设时间间隔拍摄目标环境得到目标帧图像时,将拍摄的目标帧图像发送给终端,终端将接收的目标帧图像进行显示。
S204,将对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于物体平面上展示。
其中,该物体平面可以指显示于目标帧图像中的物体的表面,也即未被遮挡的物体的表面;该物体平面除了在目标帧图像中出现,也可以出现于在先帧图像中。此外,该物体平面可以水平平面或曲线平面。在后续实施例中,以物体平面为水平的物体平面为例进行说明。
物体可以指目标环境中的、且拍摄在目标帧图像中的目标物,该物体各个方向上的表面可以是水平平面;此外,该物体可以具有特定几何形状。例如,该物体可以是用于装快递的快递箱,或者是阶梯等等。
拟合图形可以指:利用曲线对边缘点进行拟合所得的闭合曲线图,或利用直线对边缘点进行拟合所得的闭合图形。如图3所示,虚线图形即为对阶梯某个台阶平面的边缘点拟合而 成的拟合图形。
当该目标帧图像是三维图像,对应的边缘点也可以是目标帧图像中物体平面的体素,或者是目标环境(即真实场景)中目标物体的三维边缘点。当该目标帧图像是二维图像,对应的边缘点也可以是二维的像素,或者是目标环境中目标物体的二维边缘点。
在一个实施例中,通过叠加展示的拟合图形可以确定物体平面的边缘点的拟合程度,当拟合程度达到预设拟合条件时,可以直接将拟合图形作为物体平面的平面轮廓。当拟合程度未达到预设拟合条件时,可以执行S206。预设拟合条件可以指物体平面的拟合图形的形状是否为目标形状,例如,对于矩形物体,拟合图形是否为矩形;或者,在侧视角度下,拟合图形是否为菱形。
在一个实施例中,上述拟合图形是通过边缘点拟合步骤对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的,该边缘点拟合步骤包括:终端根据深度信息确定目标帧图像中各点对应的空间位置;基于空间位置和平面方程确定各点所在的平面,得到物体平面;对物体平面的边缘点和在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形。终端将所得的拟合图形叠加于该目标帧图像的物体平面上展示。
其中,该空间位置可以是三维的空间坐标系中的空间坐标。
在一个实施例中,目标帧图像中包含携带了方向信息的图形码。上述根据深度信息确定目标帧图像中各点对应的空间位置的步骤,具体可以包括:终端根据图形码携带的方向信息确定坐标系基准方向;基于坐标系基准方向构建空间坐标系;在空间坐标系中,基于深度信息确定目标帧图像中各点对应的空间位置。
其中,该图形码可以是条形码或二维码,如用于指示方向的Apriltag。
在一个实施例中,终端根据深度信息可以计算目标帧图像中物体的各点与摄像头之间的距离,以及物体各点之间的距离,根据上述的距离可以确定目标帧图像中物体的各点在空间坐标系中的空间位置。
在一个实施例中,对于确定物体表面各点所在的平面的步骤,具体可以包括:终端将空间位置(即空间坐标)和平面方程输入至平面拟合模型,通过该平面拟合模型按照平面方程对物体各点的空间位置进行拟合,得到各点所在的平面,进而得到该平面的物体平面。
在一个实施例中,终端利用直线或曲线对物体平面的边缘点进行拟合,得到闭合曲线图,或由直线段构成的闭合图形,将该闭合曲线图或闭合图形确定为物体平面边缘点的拟合图形。
在一个实施例中,上述对物体平面的边缘点和在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形的步骤,具体可以包括:当该物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡时,终端则在各在先帧图像中确定包含物体平面的在先目标帧图像;从在先目标帧图像中的物体平面提取边缘点;从在先目标帧图像中物体平面提取的边缘点和目标帧图像中物体平面的边缘点中,选取目标边缘点;对选取的目标边缘点进行拟合,得到拟合图形。然后,终端将所得的拟合图形叠加于该目标帧图像的物体平面上展示。
其中,上述的边缘点为三维边缘点。在先帧图像是在目标帧图像前对目标环境采集所得的帧图像,例如是目标帧图像的前一帧图像,或是在目标帧图像之前采集的前n帧图像(即前一帧图像至前第n帧图像)。对应地,在先目标帧图像是指在先帧图像中包含该物体的物体平面的图像。
对于目标边缘点的选取过程,可以包括:先将在先目标帧图像与目标帧图像进行叠加,从而使目标帧图像中物体平面的边缘点与先目标帧图像中相应物体平面的边缘点叠加在一起得到边缘点集合,然后从边缘点集合的边缘点密集区中选取边缘点;其中,从边缘点密集区中选取边缘点从而可以避免离散点的干扰,如边缘点在叠加之后,可能会出现一些偏离正常范围的边缘点,将这些边缘点排除可提高拟合效果。
此外,还可以按照摄像头与目标物体之间的距离为每帧图像设置权重,然后按照权重的大小选取目标边缘点,即从权重大的帧图像中选取较多的目标边缘点,从权重小的帧图像中选取较少的目标边缘点。
在一个实施例中,对于该物体平面是否为物体的部分区域平面,或是否被目标帧图像中的其它物体遮挡的判断步骤,具体可以包括:终端将三维边缘点映射为二维边缘点;确定二维边缘点对应的凸多边形,并计算凸多边形的面积;确定二维边缘点的外接图形,并计算外接图形的面积;当凸多边形的面积与外接图形的面积的比值达到预设比值时,则确定物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡。
其中,凸多边形是指:若将一个多边形所有边中的任意一条边向两方无限延长成为一直线时,其它各边都在此直线的同旁,而且该多边形内角应均不是优角,任意两个顶点间的线段位于多边形的内部或边上。
在一个实施例中,上述从在先目标帧图像中物体平面提取的边缘点和目标帧图像中物体平面的边缘点中,选取目标边缘点的步骤,具体可以包括:终端确定目标帧图像对应的第一权重和在先目标帧图像对应的第二权重;第一权重与第二权重不相等;在目标帧图像内物体平面的边缘点中,按照第一权重选取第一目标边缘点;以及,在从在先目标帧图像内物体平面提取的边缘点中,按照第二权重选取第二目标边缘点;将第一目标边缘点和第二目标边缘点作为目标边缘点。
其中,第一权重与第二权重的大小与摄像头拍摄图像时距物体之间的距离相关,即摄像头拍摄目标环境中的物体时,距离越远则对应的权重越小;同理,摄像头拍摄目标环境中的物体时,距离越近则对应的权重越大。
在一个实施例中,终端确定拟合图形的尺寸;当尺寸小于预设尺寸时,重新获取对目标环境进行采集所得的目标帧图像,然后执行S204。当尺寸大于或等于预设尺寸时,执行S206。
举例来说,考虑到现实场景的复杂程度,若目标环境中物体对应的物体平面的长和宽均在一定的范围内,在拟合过程中,可以选择对物体平面的大小进行检查,只有在满足阈值的情况下才确定拟合成功。
S206,在拟合图形中,将未在在先帧图像中的物体平面内出现的边缘点删除。
其中,在先帧图像是在目标帧图像前对目标环境采集所得的帧图像,例如是目标帧图像的前一帧图像,或是在目标帧图像之前采集的前n帧图像,或是在目标帧图像之前采集的第n帧图像。
在一个实施例中,S206具体可以包括:若拟合图形中的边缘点未在在先帧图像中的物体平面内出现,或拟合图形中的边缘点未在在先帧图像中的物体平面对应的平面轮廓内出现,则将该未出现的边缘点从拟合图形中删除。
当摄像头距离物体较远、且摄像头光心与物体的入射角较大,同时目标环境中存在多个物体交错放置时,拟合出来的拟合图形可能会存在误差,需要在拟合图形中将未在在先帧图像中的物体平面内出现的边缘点删除。
例如,如图4所示,图中的两个物体为立方体物体,其中一个物体上部分表面的拟合图形(如图中的灰色区域)沿着另一个物体的侧面延伸出去了,导致这种现象的原因是此种情况下深度信息存在较大的误差,平面拟合出错。因此,对于由多帧图像中物体平面的边缘点拟合而成的拟合图形,将该拟合图形中未于在先帧图像中的物体平面内出现的边缘点进行删除(即图中虚线椭圆框内的各边缘点删除),其中,未于在先帧图像中的物体平面内出现的边缘点为局外点,从而可以避免拟合过程中出现拟合误差,从而提高了拟合图形的正确性。
S208,在目标帧图像的物体平面上,显示通过拟合图形中的剩余边缘点构成的平面轮廓。
其中,平面轮廓可以指物体表面的轮廓,该轮廓可以是矩形轮廓,或其它形状的轮廓。
在一个实施例中,终端可以生成与拟合图形中的剩余边缘点外接的平面图形,该平面图形即为物体的平面轮廓。
例如,如图5所示,左边图中的虚线部分为原始的拟合图形,右边图中的虚线部分为将未于在先帧图像中的物体平面内出现的边缘点从拟合图形中删除之后的新拟合图形,从而根据新拟合图形构建物体的平面轮廓,因此可以提高物体平面轮廓的识别准确性。
当获得物体的平面轮廓时,可以基于该平面轮廓确定物体的尺寸和空间位置,进而终端可以进行相应的操作。其中,不同的应用场景,终端可以执行不同的操作,以以下应用场景对终端执行的操作进行阐述:
场景1,机器人移动的应用场景。
在一个实施例中,当终端为机器人,机器人在通过目标帧图像构建的各平面轮廓中确定机器人移动路径,或者在平面轮廓中选择机器人落脚点;按照机器人移动路径或机器人落脚点进行移动。
对于机器人而言,在移动过程中,需要在前方的道路(可以包括图3的阶梯)上规划移动路径。此外,对于足式机器人,在移动过程中,需要对道路上的落脚点进行选择,以避免踏空或碰撞到其它物体。
场景2,机器人夹取目标物体的应用场景。
在一个实施例中,上述的物体平面为机器臂夹取的目标物体相应平面的区域,当终端为机器臂时,机器臂根据平面轮廓确定目标物体的尺寸、朝向和空间位置;基于目标物体的尺寸、朝向和空间位置夹取目标物体;将夹取的目标物体放置于指定位置。
其中,尺寸可以指目标物体的长、宽和高中的至少两种。朝向可以指目标物体所朝方的方位,也可以指目标物体所放置的方向,如快递箱正向机器臂放置。
在一个实施例中,上述将夹取的目标物体放置于指定位置的步骤,具体可以包括:终端在移动机器臂以放置目标物体的过程中,采集指定位置的环境图像;对环境图像中各目标物体平面和在先帧环境图像中对应目标物体平面的边缘点进行拟合,得到目标拟合图形;在目标拟合图形中,将未在在先帧环境图像的目标物体平面中出现的边缘点删除;在先帧环境图像是在环境图像之前采集的帧图像;通过目标拟合图形中的剩余边缘点构成目标平面轮廓;根据目标平面轮廓确定目标物体的放置姿态;按照放置姿态,将目标物体放置于目标物体平面。
其中,放置姿态为目标物体放置的位置和朝向。
例如,若指定位置为另一物体的上方,即把目标物体放置于另一物体的上方,则可以通过上述方式识别出该另一物体对应的目标平面轮廓,然后根据该目标平面轮廓确定该另一物体的尺寸和空间位置,然后根据该另一物体的尺寸和空间位置将该目标物体放置于该另一物体上,以确保该目标物体在放置过程中避免碰撞,且放置于该目标物体的正上方。
场景3,增强现实的应用场景。
在一个实施例中,当终端为AR设备时,在获得物体的平面轮廓之后,确定该平面轮廓的尺寸和空间位置,生成关于尺寸和空间位置的虚拟标识,在该AR设备显示的关于目标环境的真实画面时,在该真实画面中该物体的附近显示该虚拟标识。
上述实施例中,采集目标环境得到目标帧图像,通过对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,从而可以获得各物体平面的拟合图形,若该拟合图形中的边缘点未出现在之前帧图像的物体平面内,则将该未出现的边缘点作为局外点进行删除,从而得到由剩余边缘点构成的平面轮廓,从而无需利用深度学习便可识别出目标帧图像中各物体的平面轮廓,减少了训练耗时,有效地提高了目标物平面轮廓的识别效率。此外,将该未出现的边缘点作为局外点进行删除得到物体的平面轮廓,从而可以多个物 体交错放置而影响平面轮廓的识别,提高了平面轮廓识别的准确性。
在一个实施例中,如图6所示,提供了另一种平面轮廓识别方法,该方法可由终端102执行,或由服务器104,或由终端102和服务器104协同执行,以该方法由图1中的终端102执行为例进行说明,包括以下步骤:
S602,获取对目标环境进行采集所得的目标帧图像。
其中,目标环境可以是终端所处的工作环境,该终端如快递领域中机器人进行物品分拣的环境,又如机器人在工作过程中行走时的道路环境。
目标帧图像可以指终端通过内置的摄像头采集目标环境的图像或视频所得到的图像,或者通过独立的摄像头采集目标环境的图像或视频所得到的图像。目标帧图像可以是携带深度信息的三维图像,如深度图像。在实际应用中,目标帧图像可以是目标环境中阶梯的图像,如图3所示。
在一个实施例中,若终端内置的摄像头实时采集目标环境得到视频时,对该视频进行解码得到目标帧图像。或者,若独立的摄像头实时采集目标环境得到视频时,将该视频实时发送至终端,终端对该视频进行解码得到目标帧图像。其中,该目标帧图像为该视频中的视频帧。
在另一个实施例中,终端通过内置的摄像头按照预设时间间隔拍摄目标环境得到目标帧图像;或者,通过独立的摄像头按照预设时间间隔拍摄目标环境得到目标帧图像,然后将拍摄的目标帧图像传输给终端。
在一个实施例中,终端在获得目标帧图像之后,可以通过其显示屏显示该目标帧图像。
S604,对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形。
其中,在先帧图像是在目标帧图像前对目标环境采集所得的帧图像。该物体平面可以指显示于目标帧图像中的物体的表面,也即未被遮挡的物体的表面;此外,该物体平面可以水平平面或曲线平面。在后续实施例中,以物体平面为水平的物体平面为例进行说明。
物体可以指目标环境中的、且拍摄在目标帧图像中的目标物,该物体各个方向上的表面可以是水平平面;此外,该物体可以具有特定几何形状。例如,该物体可以是用于装快递的快递箱,或者是阶梯等等。
拟合图形可以指:利用曲线对边缘点进行拟合所得的闭合曲线图,或利用直线对边缘点进行拟合所得的闭合图形。如图3所示,虚线图形即为对阶梯某个台阶平面的边缘点拟合而成的拟合图形。由于该目标帧图像可以是三维图像,对应的边缘点也可以是三维的边缘点。
当该目标帧图像是三维图像,对应的边缘点也可以是目标帧图像中物体平面的体素,或者是目标环境(即真实场景)中目标物体的三维边缘点。当该目标帧图像是二维图像,对应的边缘点也可以是二维的像素,或者是目标环境中目标物体的二维边缘点。
在一个实施例中,终端根据深度信息确定目标帧图像中各点对应的空间位置;基于空间位置和平面方程确定各点所在的平面,得到物体平面;对物体平面的边缘点和在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形。
其中,该空间位置可以是三维的空间坐标系中的空间坐标。
在一个实施例中,目标帧图像中包含携带了方向信息的图形码。上述根据深度信息确定目标帧图像中各点对应的空间位置的步骤,具体可以包括:终端根据图形码携带的方向信息确定坐标系基准方向;基于坐标系基准方向构建空间坐标系;在空间坐标系中,基于深度信息确定目标帧图像中各点对应的空间位置。
其中,该图形码可以是条形码或二维码,如用于指示方向的Apriltag。
例如,如图7所示,通过Apriltag可以确定空间坐标系的坐标基准方向(也可称为正方向),将该坐标基准方向作为x轴方向,然后将该物体对应的物体平面的法向作为z轴方向,将x轴方向和z轴方向的叉乘方向作为y轴方向。举例来说,x轴方向记为向量a,z轴方向记为向量b,向量a和向量b的叉乘方向记为a×b的方向,该a×b的方向为:四指由b开始指向a,拇指的指向就是b×a的方向,该方向垂直于b和a所在的平面。
在一个实施例中,终端根据深度信息可以计算目标帧图像中物体的各点与摄像头之间的距离,以及物体各点之间的距离,根据上述的距离可以确定目标帧图像中物体的各点在空间坐标系中的空间位置。
在一个实施例中,对于确定物体表面各点所在的平面的步骤,具体可以包括:终端将空间位置(即空间坐标)和平面方程输入至平面拟合模型,通过该平面拟合模型按照平面方程对物体各点的空间位置进行拟合,得到各点所在的平面,进而得到该平面的物体平面。
在一个实施例中,终端利用直线或曲线对物体平面的边缘点进行拟合,得到闭合曲线图,或由直线段构成的闭合图形,将该闭合曲线图或闭合图形确定为物体平面边缘点的拟合图形。
在一个实施例中,上述对物体平面的边缘点和在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形的步骤,具体可以包括:当该物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡时,终端则在各在先帧图像中确定包含物体平面的在先目标帧图像;从在先目标帧图像中的物体平面提取边缘点;从在先目标帧图像中物体平面提取的边缘点和目标帧图像中物体平面的边缘点中,选取目标边缘点;对选取的目标边缘点进行拟合,得到拟合图形。
其中,上述的边缘点为三维边缘点。在先帧图像是在目标帧图像前对目标环境采集所得的帧图像,例如是目标帧图像的前一帧图像,或是在目标帧图像之前采集的前n帧图像(即前一帧图像至前第n帧图像)。对应地,在先目标帧图像是指在先帧图像中包含该物体的物体平面的图像。
对于目标边缘点的选取过程,可以包括:先将在先目标帧图像与目标帧图像进行叠加,从而使目标帧图像中物体平面的边缘点与先目标帧图像中相应物体平面的边缘点叠加在一起得到边缘点集合,然后从边缘点集合的边缘点密集区中选取边缘点;其中,从边缘点密集区中选取边缘点从而可以避免离散点的干扰,如边缘点在叠加之后,可能会出现一些偏离正常范围的边缘点,将这些边缘点排除可提高拟合效果。
此外,还可以按照摄像头与目标物体之间的距离为每帧图像设置权重,然后按照权重的大小选取目标边缘点,即从权重大的帧图像中选取较多的目标边缘点,从权重小的帧图像中选取较少的目标边缘点。
在一个实施例中,对于该物体平面是否为物体的部分区域平面,或是否被目标帧图像中的其它物体遮挡的判断步骤,具体可以包括:终端将三维边缘点映射为二维边缘点;确定二维边缘点对应的凸多边形,并计算凸多边形的面积;确定二维边缘点的外接图形,并计算外接图形的面积;当凸多边形的面积与外接图形的面积的比值达到预设比值时,则确定物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡。
其中,凸多边形是指:若将一个多边形所有边中的任意一条边向两方无限延长成为一直线时,其它各边都在此直线的同旁,而且该多边形内角应均不是优角,任意两个顶点间的线段位于多边形的内部或边上。
在一个实施例中,上述从在先目标帧图像中物体平面提取的边缘点和目标帧图像中物体平面的边缘点中,选取目标边缘点的步骤,具体可以包括:终端确定目标帧图像对应的第一权重和在先目标帧图像对应的第二权重;第一权重与第二权重不相等;在目标帧图像内物体平面的边缘点中,按照第一权重选取第一目标边缘点;以及,在从在先目标帧图像内物体平 面提取的边缘点中,按照第二权重选取第二目标边缘点;将第一目标边缘点和第二目标边缘点作为目标边缘点。
其中,第一权重与第二权重的大小与摄像头拍摄图像时距物体之间的距离相关,即摄像头拍摄目标环境中的物体时,距离越远则对应的权重越小;同理,摄像头拍摄目标环境中的物体时,距离越近则对应的权重越大。
在一个实施例中,终端确定拟合图形的尺寸;当尺寸小于预设尺寸时,重新获取对目标环境进行采集所得的目标帧图像,然后执行S604。当尺寸大于或等于预设尺寸时,执行S606。
举例来说,考虑到现实场景的复杂程度,若目标环境中物体对应的物体平面的长和宽均在一定的范围内,在拟合过程中,可以选择对物体平面的大小进行检查,只有在满足阈值的情况下才确定拟合成功。
S606,在拟合图形中,将未在在先帧图像的物体平面内出现的边缘点删除。
其中,在先帧图像是在目标帧图像前对目标环境采集所得的帧图像,例如是目标帧图像的前一帧图像,或是在目标帧图像之前采集的前n帧图像,或是在目标帧图像之前采集的第n帧图像。
在一个实施例中,S606具体可以包括:若拟合图形中的边缘点未在在先帧图像中的物体平面内出现,或拟合图形中的边缘点未在在先帧图像中的物体平面对应的平面轮廓内出现,则将该未出现的边缘点从拟合图形中删除。
当摄像头距离物体较远、且摄像头光心与物体的入射角较大,同时目标环境中存在多个物体交错放置时,拟合出来的拟合图形可能会存在误差,需要在拟合图形中将未在在先帧图像中的物体平面内出现的边缘点删除。
例如,如图4所示,图中的两个物体为立方体物体,其中一个物体上部分表面的拟合图形(如图中的灰色区域)沿着另一个物体的侧面延伸出去了,导致这种现象的原因是此种情况下深度信息存在较大的误差,平面拟合出错。因此,对于由多帧图像中物体平面的边缘点拟合而成的拟合图形,将该拟合图形中未于在先帧图像中的物体平面内出现的边缘点进行删除(即图中虚线椭圆框内的各边缘点删除),其中,未于在先帧图像中的物体平面内出现的边缘点为局外点,从而可以避免拟合过程中出现拟合误差,从而提高了拟合图形的正确性。
S608,将拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
其中,平面轮廓可以指目标物体的物体平面的轮廓,该轮廓可以是矩形轮廓,或其它形状的轮廓。
在一个实施例中,终端可以生成与拟合图形中的剩余边缘点外接的平面图形,该平面图形即为物体的平面轮廓。
例如,如图5所示,左边图中的虚线部分为原始的拟合图形,右边图中的虚线部分为将未于在先帧图像中的物体平面内出现的边缘点从拟合图形中删除之后的新拟合图形,从而根据新拟合图形构建物体的平面轮廓,因此可以提高物体平面轮廓的识别准确性。
当获得物体的平面轮廓时,可以基于该平面轮廓确定物体的尺寸和空间位置,进而终端可以进行相应的操作。其中,不同的应用场景,终端可以执行不同的操作,以以下应用场景对终端执行的操作进行阐述:
场景1,机器人移动的应用场景。
在一个实施例中,当终端为机器人,机器人在通过目标帧图像构建的各平面轮廓中确定机器人移动路径,或者在平面轮廓中选择机器人落脚点;按照机器人移动路径或机器人落脚点进行移动。
对于机器人而言,在移动过程中,需要在前方的道路(可以包括图3的阶梯)上规划移动路径。此外,对于足式机器人,在移动过程中,需要对道路上的落脚点进行选择,以避免 踏空或碰撞到其它物体。
场景2,机器人夹取目标物体的应用场景。
在一个实施例中,上述的物体平面为目标物体相应平面的区域,当终端为机器臂时,机器臂根据平面轮廓确定目标物体的尺寸、朝向和空间位置;基于目标物体的尺寸、朝向和空间位置夹取目标物体;将夹取的目标物体放置于指定位置。
其中,尺寸可以指目标物体的长、宽和高中的至少两种。朝向可以指目标物体所朝方的方位,也可以指目标物体所放置的方向,如快递箱正向机器臂放置。
在一个实施例中,上述将夹取的目标物体放置于指定位置的步骤,具体可以包括:终端在移动机器臂以放置目标物体的过程中,采集指定位置的环境图像;对环境图像中各目标物体平面和在先帧环境图像中对应目标物体平面的边缘点进行拟合,得到目标拟合图形;在目标拟合图形中,将未在在先帧环境图像的目标物体平面中出现的边缘点删除;在先帧环境图像是在环境图像之前采集的帧图像;通过目标拟合图形中的剩余边缘点构成目标平面轮廓;根据目标平面轮廓确定目标物体的放置姿态;按照放置姿态,将目标物体放置于目标物体平面。
其中,放置姿态为目标物体放置的位置和朝向。
例如,若指定位置为另一物体的上方,即把目标物体放置于另一物体的上方,则可以通过上述方式识别出该另一物体对应的目标平面轮廓,然后根据该目标平面轮廓确定该另一物体的尺寸和空间位置,然后根据该另一物体的尺寸和空间位置将该目标物体放置于该另一物体上,以确保该目标物体在放置过程中避免碰撞,且放置于该目标物体的正上方。
场景3,增强现实的应用场景。
在一个实施例中,当终端为AR设备时,在获得物体的平面轮廓之后,确定该平面轮廓的尺寸和空间位置,生成关于尺寸和空间位置的虚拟标识,在该AR设备显示的关于目标环境的真实画面时,在该真实画面中该物体的附近显示该虚拟标识。
上述实施例中,采集目标环境得到目标帧图像,对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,从而可以获得各物体平面的拟合图形,若该拟合图形中的边缘点未出现在之前帧图像的物体平面内,则将该未出现的边缘点作为局外点进行删除,从而得到由剩余边缘点构成的平面轮廓,减少了因采用深度学习而导致的训练耗时,有效地提高了目标物平面轮廓的识别效率。此外,将该未出现的边缘点作为局外点进行删除得到物体的平面轮廓,从而可以多个物体交错放置而影响平面轮廓的识别,提高了平面轮廓识别的准确性。
作为一个示例,以上述方法应用于移动机器人或机械臂为例进行阐述。通过RGB-D相机采集视频,对于RGB-D相机采集的视频,各视频帧包含不同时刻的RGB彩色图或Depth深度图(该深度图即为上述的携带深度信息的目标帧图像)。
在本实施例中,矩形轮廓识别方法主要基于深度信息完成,一个矩形轮廓识别的例子如图3所示,图中的每个台阶的表面为矩形,在深度图中对该台阶进行时域融合所得的矩形平面用虚线表示,最终的矩形轮廓通过黑色的实线表示,矩形轮廓的顶点由叉字形图案表示,具体如图所示3。
其中,时域融合所得的矩形平面指的是:将不同时刻的深度图中该台阶平面的三维边缘点融合在某一时刻的空间坐标系下,然后从融合的三维边缘点中提取目标三维边缘点,根据该目标边缘点拟合出来的矩形平面,即上述实施例中所述的拟合图形。
需要注意的是,目标环境中的矩形平面可能较大,单个帧图像的视野可能不能覆盖整个矩形平面,为了应对这种情况,在本实施例中,一方面会逐帧实时地识别和筛选目标环境中 拟合出来的平面是否为目标的矩形平面;另一方面,在识别矩形平面的过程中会将不同帧图像中的属于同一物体的矩形平面的三维边缘点进行融合,并在此基础上优化并提取出更为精确且完整的矩形区域。
当识别出目标环境中的矩形轮廓,便可以确定目标环境中所有矩形平面的空间位置,从而可以进行相应的操作,该操作按不同场景进行阐述:
场景1,对物品的分拣场景
若机械臂需要夹取或移动立方体类型的物体,通过本实施例的方案可以提供该类物体的尺寸、朝向及其在三维空间中的空间位置等视觉信息,机械臂基于这些视觉信息可以准确夹取该目标物体。同时,机械臂还可以将该类物体放置到目标环境的指定位置,控制放置后目标物体的姿态,避免该目标物体与目标环境中其它物体的碰撞。
场景2,针对移动机器人的移动场景
该移动机器人可以是足式机器人,若在足式机器人移动过程中存在台阶等高难度场景,该方法可以实时的为足式机器人提供精确的落脚区域,避免足式机器人踏空,或避免足式机器人的足部在移动过程中与目标环境中的物体产生碰撞。
此外,落脚区域可以方便足式机器人选择更为合理的落脚点,或者规划更为合理的移动路径。
通过本实施例的方案,可以具备以下技术效果:
实时性,可以在计算平台上实时逐帧的识别和筛选拟合出来的矩形平面是否为目标物体真实的矩形平面,同时矩形轮廓的识别也是实时的。
准确性,矩形轮廓的识别是准确的。
场景普适性,本实施例中的矩形轮廓的识别是普适的,可以适用大部分场景。
算法鲁棒性,本实施例中的矩形轮廓的识别算法是鲁棒的,失败的情况很少。
作为另一个示例,通过实验发现矩形轮廓的识别过程需要面临一些核心问题,具体如下:1)矩形平面并不是一个中心对称图形,需要获取矩形平面的正方向。2)在运算资源有限的情况下,三维(3D)数据的处理往往需要一系列复杂的操作,对实时性是一个挑战。3)对整个场景完成平面拟合之后,会有各类拟合的平面被提取出来,需要筛选出目标矩形平面。4)如果场景中的矩形物体平面较大,单次的矩形轮廓识别只能获得物体的一部分矩形平面,需要尽可能地获得更加完整的矩形区域。5)对复杂的场景而言,当RGB-D相机距离目标平面较远且相机光心与平面的入射角较大的时候,深度信息的误差随之增大,识别出来的矩形轮廓往往会出错,需要进一步优化。
为了平衡有限的计算资源以及矩形平面轮廓识别精度的问题,本实施例提出了一个合理的解决方案,流程图如图8所示:
S1,基于图形码获取正方向。
其中,该图形码可以是Apriltag,或者是其它可用于指示方向的图形。以Apriltag为例,通过Apriltag获取到正方向之后,在目标物体的矩形平面构建空间坐标系,如图7所示,平面的法线作为该坐标系的z轴,平面的正方向作为该坐标系的x轴,平面的正方向即为与矩形任一对边平行的方向。
本实施例通过Apriltag提供正方向,即在目标物体前放置一个Apriltag,使其朝向与目标物体的矩形平面的某一对边平行,可以有利于提高场景的普适性和鲁棒性。
S2,对单帧的深度图进行平面拟合。
通过平面拟合算法提取出深度图中的多个平面作为输入,具体的信息为世界坐标系下每个平面对应的所有三维点坐标以及该平面的平面方程。
S3,提取该物体平面的三维边缘点。
考虑到S2中拟合出来的单个矩形平面包含较多的三维点,为节约计算资源,当前步骤会提取出每个矩形平面对应的三维边缘点,后续操作主要在这些三维边缘点上进行,否则整个算法是很难达到实时的。
S4,将三维边缘点映射为二维边缘点。
由于三维数据的处理相对二维数据往往更为复杂,针对涉及复杂运算的模块,本实施例中将三维边缘点映射至二维边缘点并完成相关运算。具体的操作是:首先将物体平面的三维边缘点映射到S2中拟合出来的物体平面上,之后在该物体平面内构建二维坐标系,获取三维边缘点对应的二维坐标,即得到二维边缘点。
S5,获取二维边缘点对应的凸多边形,计算该凸多边形的面积s1。
使用物体平面的二维边缘点对应的凸多边形表示目标物体的矩形区域,然后计算该凸多边形在二维坐标系下对应的面积s1。由于在S3和S4中获取的边缘点是离散的,同时深度信息的误差可能会导致二维边缘点呈现不规则的形状(如锯齿状),采用凸多边形表示目标物体的矩形区域的方式可以避免上述不可控因素的干扰。
S6,拟合凸多边形最小的外接矩形,并计算该外接矩形的面积s2。
如图9所示,左图表示常规方法的拟合结果,如果物体平面的正方向未知,拟合出来的外接矩形往往与真实的矩形轮廓差异较大。
右图为满足预期的最小外接矩形拟合,拟合的方式包括:一种方案是借助S1获取的矩形平面正方向,首先将凸多边形的朝向旋转至正方向,之后进行外接矩形拟合即可;另一种方案是通过相应算法直接完成对凸多边形的最小外接矩形拟合。上述拟合方案在一定程度上可以处理深度信息存在误差以及场景中平面之间存在遮挡的情况。
S7,判断s1与s2的比值是否大于预设阈值。
其中,该预设阈值可以根据实际情况进行设定,这里以预设阈值等于0.9为例进行阐述。若S2拟合出的物体平面与矩形差别较大,那么S5和S6对应的某个平面的凸多边形和最小的外接矩形之间的重合面积是有限的,基于这样的特点,通过选取预设阈值为0.9进行非矩形平面的筛选。
考虑物体平面被遮挡的情况,一方面,在时域过程中,物体平面距离相机从远到近的过程中遮挡情况通常会逐步改善;另一方面时域过程中矩形平面轮廓的融合也可以在一定程度上缓解遮挡的影响;同时,针对复杂的遮挡情况,可以对应调整上述的预设阈值。
S8,对不同帧的深度图对应的三维边缘点进行融合。
首先,对于不同帧的深度图,将各帧的物体平面的三维边缘点进行时域融合,融合的对象为同一个矩形平面在时域过程中对应的多组三维边缘点。基于SLAM的结果,将不同时刻的三维边缘点转换至某一时刻坐标系下,之后重新提取这些离散点的三维边缘点,提取三维边缘点的目的是减少数据量。此步骤在时域过程中不断融合目标物体的矩形平面,得到的结果接近所需的矩形区域。
其次是对矩形平面方程对应参数的优化,具体见如下公式,其中,d表示3D下矩形平面中心点或均值点距离相机光心的距离,w i为第i帧对应的待优化参数的权重,c i表示第i帧检测到的矩形平面方程的待优化参数(如法向)。
Figure PCTCN2021114064-appb-000001
Figure PCTCN2021114064-appb-000002
S9,构建矩形平面的坐标系并提取矩形轮廓。
基于S8的结果获取规则的矩形轮廓,即矩形轮廓的四个顶点。S1和S2分别提供了正方向和法向,基于此构建矩形平面的空间坐标系。如图7所示,其中,法向为z轴,正方向为x轴,y为x轴和z轴的叉乘结果,同时x-y平面与矩形平面重合。
接下来,将矩形平面对应的三维点投影到x-y平面,得到x min,x max,y min,y max,那么该矩形轮廓的4个顶点坐标即为(x min,y min),(x min,y ma),(x max,y min)以及(x ma,y max),每个顶点对应的z值可以通过平面方程获取。
S10,判断矩形平面的区域尺寸是否达到预设尺寸阈值。
考虑到现实场景的复杂程度,若目标物体的矩形平面长和宽在一定的范围内,可以选择对矩形平面的大小进行检查,只有在满足阈值的情况下才认为识别成功,从而得到矩形平面对应的矩形轮廓。
S11,矩形轮廓的优化。
针对复杂但常见的场景,比如相机距离某个矩形平面较远,且相机光心与矩形平面的入射角较大,同时场景中存在多个物体交错放置的情况。如图4所示,以两个立方体为例,此时灰色区域为平面拟合区域,可以看出,前方立方体物体上表面拟合出来的物体平面沿着后方立方体物体的侧面延伸出去了,导致平面拟合出错。本实施例通过遍历S8时域融合得到的离散点,筛选掉那些在当前帧相机视野内但不在S3对应的单帧图像的矩形平面轮廓范围内的离散点,这些离散点就是outliers。如图5所示,左图表示步骤8时域融合得到的离散点中存在outliers的矩形平面轮廓识别结果,右图为矩形平面轮廓优化之后的结果。
S12,获取矩形轮廓顶点的三维坐标。
其中,矩形轮廓顶点可以参考图3中的A-D四个顶点。
通过上述实施例的方案,可以具有以下技术效果:
可以兼顾实时性和精确度,通过3D和2D的映射过程极大地降低了运算复杂度,同时平面轮廓的提取极大地降低了运算的数据量,使得算法可以实时地识别视频中的矩形平面轮廓;
另一方面,本专利通过引入最小外接矩形,在过滤非矩形平面的同时更为鲁棒的处理带有误差的深度信息以及平面存在遮挡的场景;
针对大面积的矩形区域,通过对时域信息进行融合和优化使得最终的矩形轮廓更为完整和精确;
针对复杂场景,通过矩形轮廓的优化,极大的缓解了平面的错误拟合对矩形轮廓识别的影响,而且整个算法可以实时且鲁棒的稳定运行。
应该理解的是,虽然图2、6、8的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,图2、6、8中的至少一部分步骤可以包括多个步骤或者多个阶段,这些步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤中的步骤或者阶段的至少一部分轮流或者交替地执行。
在一个实施例中,如图10所示,提供了一种平面轮廓识别装置,该装置可以采用软件模块或硬件模块,或者是二者的结合成为计算机设备的一部分,该装置具体包括:第一显示模块1002、叠加模块1004、删除模块1006和第二显示模块1008,其中:
第一显示模块1002,用于显示对目标环境进行采集所得的目标帧图像;
叠加模块1004,用于将对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于物体平面上展示;在先帧图像是在目标帧图像 前对目标环境采集所得的帧图像;
删除模块1006,用于在拟合图形中,将未在在先帧图像的物体平面内出现的边缘点删除;
第二显示模块1008,用于在目标帧图像的物体平面上,显示通过拟合图形中的剩余边缘点构成的平面轮廓。
在一个实施例中,该目标帧图像包含深度信息;如图11所示,该装置还包括:拟合模块1010;其中:
拟合模块1010,用于根据深度信息确定目标帧图像中各点对应的空间位置;基于空间位置和平面方程确定各点所在的平面,得到物体平面;对物体平面的边缘点和在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形。
在一个实施例中,拟合模块1010,还用于当物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡时,则在各在先帧图像中确定包含物体平面的在先目标帧图像;从在先目标帧图像中的物体平面提取边缘点;从在先目标帧图像中物体平面提取的边缘点和目标帧图像中物体平面的边缘点中,选取目标边缘点;对选取的目标边缘点进行拟合,得到拟合图形。
在一个实施例中,该边缘点为三维边缘点;如图11所示,该装置还包括:映射模块1012和确定模块1014;其中:
映射模块1012,用于将三维边缘点映射为二维边缘点;
确定模块1014,用于确定二维边缘点对应的凸多边形,并计算凸多边形的面积;确定二维边缘点的外接图形,并计算外接图形的面积;当凸多边形的面积与外接图形的面积的比值达到预设比值时,则确定物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡。
上述实施例中,采集目标环境得到目标帧图像,通过对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,从而可以获得各物体平面的拟合图形,若该拟合图形中的边缘点未出现在之前帧图像的物体平面内,则将该未出现的边缘点作为局外点进行删除,从而得到由剩余边缘点构成的平面轮廓,从而无需利用深度学习便可识别出目标帧图像中各物体的平面轮廓,减少了训练耗时,有效地提高了目标物平面轮廓的识别效率。此外,将该未出现的边缘点作为局外点进行删除得到物体的平面轮廓,从而可以多个物体交错放置而影响平面轮廓的识别,提高了平面轮廓识别的准确性。
在一个实施例中,如图12所示,提供了一种平面轮廓识别装置,该装置可以采用软件模块或硬件模块,或者是二者的结合成为计算机设备的一部分,该装置具体包括:获取模块1202、拟合模块1204、删除模块1206和构建模块1208,其中:
获取模块1202,用于获取对目标环境进行采集所得的目标帧图像;
拟合模块1204,用于对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;在先帧图像是在目标帧图像前对目标环境采集所得的帧图像;
删除模块1206,用于在拟合图形中,将未在在先帧图像的物体平面内出现的边缘点删除;
构建模块1208,用于将拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
在一个实施例中,如图13所示,该装置还包括:第一规划模块1210;其中:
第一规划模块1210,用于在通过目标帧图像构建的各平面轮廓中确定机器人移动路径,或者在平面轮廓中选择机器人落脚点;按照机器人移动路径或机器人落脚点进行移动。
在一个实施例中,物体平面为目标物体相应平面的区域;如图13所示,该装置还包括:第二规划模块1212;其中:
第二规划模块1212,用于根据平面轮廓确定目标物体的尺寸、朝向和空间位置;基于目标物体的尺寸、朝向和空间位置夹取目标物体;将夹取的目标物体放置于指定位置。
在一个实施例中,如图13所示,该装置还包括:放置模块1214;其中:
获取模块1202,还用于在移动机器臂以放置目标物体的过程中,采集指定位置的环境图像;
拟合模块1204,还用于对环境图像中各目标物体平面和在先帧环境图像中对应目标物体平面的边缘点进行拟合,得到目标拟合图形;
删除模块1206,还用于在目标拟合图形中,将未在在先帧环境图像的目标物体平面中出现的边缘点删除;在先帧环境图像是在环境图像之前采集的帧图像;
构建模块1208,还用于通过目标拟合图形中的剩余边缘点构成目标平面轮廓;
放置模块1214,用于根据目标平面轮廓确定目标物体的放置姿态;按照放置姿态,将目标物体放置于目标物体平面。
在一个实施例中,目标帧图像包含深度信息;拟合模块1204,还用于根据深度信息确定目标帧图像中各点对应的空间位置;基于空间位置和平面方程确定各点所在的平面,得到物体平面;对物体平面的边缘点和在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形。
在一个实施例中,目标帧图像中包含携带了方向信息的图形码。该拟合模块1204,还用于根据图形码携带的方向信息确定坐标系基准方向;基于坐标系基准方向构建空间坐标系;在空间坐标系中,基于深度信息确定目标帧图像中各点对应的空间位置。
在一个实施例中,该拟合模块1204,还用于当物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡时,则在各在先帧图像中确定包含物体平面的在先目标帧图像;从在先目标帧图像中的物体平面提取边缘点;从在先目标帧图像中物体平面提取的边缘点和目标帧图像中物体平面的边缘点中,选取目标边缘点;对选取的目标边缘点进行拟合,得到拟合图形。
在一个实施例中,边缘点为三维边缘点;如图13所示,该装置还包括:映射模块1216和确定模块1218;其中:
映射模块1216,用于将三维边缘点映射为二维边缘点;
确定模块1218,用于确定二维边缘点对应的凸多边形,并计算凸多边形的面积;确定二维边缘点的外接图形,并计算外接图形的面积;当凸多边形的面积与外接图形的面积的比值达到预设比值时,则确定物体平面为物体的部分区域平面,或被目标帧图像中的其它物体遮挡。
在一个实施例中,该拟合模块1204,还用于确定目标帧图像对应的第一权重和在先目标帧图像对应的第二权重;第一权重与第二权重不相等;在目标帧图像内物体平面的边缘点中,按照第一权重选取第一目标边缘点;以及,在从在先目标帧图像内物体平面提取的边缘点中,按照第二权重选取第二目标边缘点;将第一目标边缘点和第二目标边缘点作为目标边缘点。
在一个实施例中,确定模块1218,还用于确定拟合图形的尺寸;
获取模块1202,还用于当尺寸小于预设尺寸时,重新获取对目标环境进行采集所得的目标帧图像;
删除模块1206,还用于当尺寸大于或等于预设尺寸时,执行在拟合图形中,将未在在先帧图像中物体的物体平面内出现的边缘点删除的步骤。
上述实施例中,采集目标环境得到目标帧图像,对目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,从而可以获得各物体平面的拟合图形,若该拟合图形中的边缘点未出现在之前帧图像的物体平面内,则将该未出现的边缘点作为局外点进行删除,从而得到由剩余边缘点构成的平面轮廓,减少了因采用深度学习而导致的训练耗 时,有效地提高了目标物平面轮廓的识别效率。此外,将该未出现的边缘点作为局外点进行删除得到物体的平面轮廓,从而可以多个物体交错放置而影响平面轮廓的识别,提高了平面轮廓识别的准确性。
关于平面轮廓识别装置的具体限定可以参见上文中对于平面轮廓识别方法的限定,在此不再赘述。上述平面轮廓识别装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图14所示。该计算机设备包括通过系统总线连接的处理器、存储器和网络接口。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机程序和数据库。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的数据库用于存储图像数据。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机程序被处理器执行时以实现一种平面轮廓识别方法。
在一个实施例中,提供了一种计算机设备,该计算机设备也可以是终端,其内部结构图可以如图15所示。该计算机设备包括通过系统总线连接的处理器、存储器、通信接口、显示屏和输入装置。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统和计算机程序。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的通信接口用于与外部的终端进行有线或无线方式的通信,无线方式可通过WIFI、运营商网络、NFC(近场通信)或其他技术实现。该计算机程序被处理器执行时以实现一种平面轮廓识别方法。该计算机设备的显示屏可以是液晶显示屏或者电子墨水显示屏,该计算机设备的输入装置可以是显示屏上覆盖的触摸层,也可以是计算机设备外壳上设置的按键、轨迹球或触控板,还可以是外接的键盘、触控板或鼠标等。
本领域技术人员可以理解,图14、15中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
在一个实施例中,还提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现上述各方法实施例中的步骤。
在一个实施例中,提供了一种计算机可读存储介质,存储有计算机程序,该计算机程序被处理器执行时实现上述各方法实施例中的步骤。
在一个实施例中,提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述各方法实施例中的步骤。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机程序来指令相关的硬件来完成,所述的计算机程序可存储于一非易失性计算机可读取存储介质中,该计算机程序在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易 失性和易失性存储器中的至少一种。非易失性存储器可包括只读存储器(Read-Only Memory,ROM)、磁带、软盘、闪存或光存储器等。易失性存储器可包括随机存取存储器(Random Access Memory,RAM)或外部高速缓冲存储器。作为说明而非局限,RAM可以是多种形式,比如静态随机存取存储器(Static Random Access Memory,SRAM)或动态随机存取存储器(Dynamic Random Access Memory,DRAM)等。
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (20)

  1. 一种平面轮廓识别方法,由计算机设备执行,其特征在于,所述方法包括:
    显示对目标环境进行采集所得的目标帧图像;
    将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
    在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
    在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余边缘点构成的平面轮廓。
  2. 根据权利要求1所述的方法,其特征在于,所述目标帧图像包含深度信息;所述拟合图形是通过边缘点拟合步骤对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的,所述边缘点拟合步骤包括:
    根据所述深度信息确定所述目标帧图像中各点对应的空间位置;
    基于所述空间位置和平面方程确定各点所在的平面,得到所述物体平面;
    对所述物体平面的边缘点和所述在先帧图像相应物体平面的边缘点进行拟合,得到所述拟合图形。
  3. 根据权利要求2所述的方法,其特征在于,所述对所述物体平面的边缘点和所述在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形包括:
    当所述物体平面为物体的部分区域平面,或被所述目标帧图像中的其它物体遮挡时,则在各所述在先帧图像中确定包含所述物体平面的在先目标帧图像;
    从所述在先目标帧图像中的所述物体平面提取边缘点;
    从所述在先目标帧图像中物体平面提取的边缘点和所述目标帧图像中物体平面的边缘点中,选取目标边缘点;
    对选取的目标边缘点进行拟合,得到拟合图形。
  4. 根据权利要求3所述的方法,其特征在于,所述边缘点为三维边缘点;所述方法还包括:
    将所述三维边缘点映射为二维边缘点;
    确定所述二维边缘点对应的凸多边形,并计算所述凸多边形的面积;
    确定所述二维边缘点的外接图形,并计算所述外接图形的面积;
    当所述凸多边形的面积与所述外接图形的面积的比值达到预设比值时,则确定所述物体平面为物体的部分区域平面,或被所述目标帧图像中的其它物体遮挡。
  5. 一种平面轮廓识别方法,由计算机设备执行,其特征在于,所述方法包括:
    获取对目标环境进行采集所得的目标帧图像;
    对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
    在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
    将所述拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
  6. 根据权利要求5所述的方法,其特征在于,所述方法应用于机器人;所述方法还包括:
    在通过所述目标帧图像构建的各所述平面轮廓中确定机器人移动路径,或者在所述平面轮廓中选择机器人落脚点;
    按照所述机器人移动路径或所述机器人落脚点进行移动。
  7. 根据权利要求5所述的方法,其特征在于,所述方法应用于机器臂;所述物体平面为所述机器臂夹取的目标物体相应平面的区域;所述方法还包括:
    根据所述平面轮廓确定所述目标物体的尺寸、朝向和空间位置;
    基于所述目标物体的尺寸、朝向和空间位置夹取所述目标物体;
    将夹取的所述目标物体放置于指定位置。
  8. 根据权利要求7所述的方法,其特征在于,所述将夹取的所述目标物体放置于指定位置包括:
    在移动机器臂以放置所述目标物体的过程中,采集指定位置的环境图像;
    对所述环境图像中各目标物体平面和在先帧环境图像中对应目标物体平面的边缘点进行拟合,得到目标拟合图形;
    在所述目标拟合图形中,将未在所述在先帧环境图像的目标物体平面中出现的边缘点删除;所述在先帧环境图像是在所述环境图像之前采集的帧图像;
    通过所述目标拟合图形中的剩余边缘点构成目标平面轮廓;
    根据所述目标平面轮廓确定所述目标物体的放置姿态;
    按照所述放置姿态,将所述目标物体放置于所述目标物体平面。
  9. 根据权利要求5所述的方法,其特征在于,所述目标帧图像包含深度信息;所述对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形包括:
    根据所述深度信息确定所述目标帧图像中各点对应的空间位置;
    基于所述空间位置和平面方程确定各点所在的平面,得到物体平面;
    对所述物体平面的边缘点和所述在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形。
  10. 根据权利要求9所述的方法,其特征在于,所述目标帧图像中包含携带了方向信息的图形码;所述根据所述深度信息确定所述目标帧图像中各点对应的空间位置包括:
    根据所述图形码携带的方向信息确定坐标系基准方向;
    基于所述坐标系基准方向构建空间坐标系;
    在所述空间坐标系中,基于所述深度信息确定所述目标帧图像中各点对应的空间位置。
  11. 根据权利要求9所述的方法,其特征在于,所述对所述物体平面的边缘点和所述在先帧图像相应物体平面的边缘点进行拟合,得到拟合图形包括:
    当所述物体平面为物体的部分区域平面,或被所述目标帧图像中的其它物体遮挡时,则在各所述在先帧图像中确定包含所述物体平面的在先目标帧图像;
    从所述在先目标帧图像中的所述物体平面提取边缘点;
    从所述在先目标帧图像中物体平面提取的边缘点和所述目标帧图像中物体平面的边缘点中,选取目标边缘点;
    对选取的目标边缘点进行拟合,得到拟合图形。
  12. 根据权利要求11所述的方法,其特征在于,所述边缘点为三维边缘点;所述方法还包括:
    将所述三维边缘点映射为二维边缘点;
    确定所述二维边缘点对应的凸多边形,并计算所述凸多边形的面积;
    确定所述二维边缘点的外接图形,并计算所述外接图形的面积;
    当所述凸多边形的面积与所述外接图形的面积的比值达到预设比值时,则确定所述物体 平面为物体的部分区域平面,或被所述目标帧图像中的其它物体遮挡。
  13. 根据权利要求11所述的方法,其特征在于,所述从所述在先目标帧图像中物体平面提取的边缘点和所述目标帧图像中物体平面的边缘点中,选取目标边缘点包括:
    确定所述目标帧图像对应的第一权重和所述在先目标帧图像对应的第二权重;所述第一权重与所述第二权重不相等;
    在所述目标帧图像内物体平面的边缘点中,按照所述第一权重选取第一目标边缘点;以及,在从所述在先目标帧图像内物体平面提取的边缘点中,按照所述第二权重选取第二目标边缘点;
    将所述第一目标边缘点和所述第二目标边缘点作为目标边缘点。
  14. 根据权利要求5至13中任一项所述的方法,其特征在于,所述方法还包括:
    确定所述拟合图形的尺寸;
    当所述尺寸小于预设尺寸时,重新获取对目标环境进行采集所得的目标帧图像;
    当所述尺寸大于或等于预设尺寸时,执行所述在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除的步骤。
  15. 一种平面轮廓检测装置,其特征在于,所述装置包括:
    第一显示模块,用于显示对目标环境进行采集所得的目标帧图像;
    叠加模块,用于将对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的拟合图形,叠加于所述物体平面上展示;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
    删除模块,用于在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
    第二显示模块,用于在所述目标帧图像的物体平面上,显示通过所述拟合图形中的剩余边缘点构成的平面轮廓。
  16. 根据权利要求15所述的装置,其特征在于,所述拟合图形是通过边缘点拟合步骤对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合而成的;所述装置还包括:
    拟合模块,用于根据所述深度信息确定所述目标帧图像中各点对应的空间位置;基于所述空间位置和平面方程确定各点所在的平面,得到所述物体平面;对所述物体平面的边缘点和所述在先帧图像相应物体平面的边缘点进行拟合,得到所述拟合图形。
  17. 一种平面轮廓检测装置,其特征在于,所述装置包括:
    获取模块,用于获取对目标环境进行采集所得的目标帧图像;
    拟合模块,用于对所述目标帧图像中各物体平面的边缘点和在先帧图像中对应物体平面的边缘点进行拟合,得到拟合图形;所述在先帧图像是在所述目标帧图像前对所述目标环境采集所得的帧图像;
    删除模块,用于在所述拟合图形中,将未在所述在先帧图像的物体平面内出现的边缘点删除;
    构建模块,用于将所述拟合图形中的剩余边缘点构成的轮廓识别为平面轮廓。
  18. 根据权利要求17所述的装置,其特征在于,所述方法应用于机器人;所述装置还包括:
    第一规划模块,用于在通过所述目标帧图像构建的各所述平面轮廓中确定机器人移动路 径,或者在所述平面轮廓中选择机器人落脚点;按照所述机器人移动路径或所述机器人落脚点进行移动。
  19. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至14中任一项所述的方法的步骤。
  20. 一种计算机可读存储介质,存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至14中任一项所述的方法的步骤。
PCT/CN2021/114064 2020-09-01 2021-08-23 平面轮廓识别方法、装置、计算机设备和存储介质 WO2022048468A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21863535.7A EP4131162A4 (en) 2020-09-01 2021-08-23 METHOD AND DEVICE FOR DETECTING PLANAR CONTOURS, COMPUTER DEVICE AND STORAGE MEDIUM
US17/956,364 US20230015214A1 (en) 2020-09-01 2022-09-29 Planar contour recognition method and apparatus, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010901647.7A CN112102342B (zh) 2020-09-01 2020-09-01 平面轮廓识别方法、装置、计算机设备和存储介质
CN202010901647.7 2020-09-01

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/956,364 Continuation US20230015214A1 (en) 2020-09-01 2022-09-29 Planar contour recognition method and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022048468A1 true WO2022048468A1 (zh) 2022-03-10

Family

ID=73756970

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114064 WO2022048468A1 (zh) 2020-09-01 2021-08-23 平面轮廓识别方法、装置、计算机设备和存储介质

Country Status (4)

Country Link
US (1) US20230015214A1 (zh)
EP (1) EP4131162A4 (zh)
CN (1) CN112102342B (zh)
WO (1) WO2022048468A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115719492A (zh) * 2022-11-29 2023-02-28 中国测绘科学研究院 一种面状要素宽窄特征识别方法、装置、设备及可读存储介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102342B (zh) * 2020-09-01 2023-12-01 腾讯科技(深圳)有限公司 平面轮廓识别方法、装置、计算机设备和存储介质
CN117095019B (zh) * 2023-10-18 2024-05-10 腾讯科技(深圳)有限公司 一种图像分割方法及相关装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101493889A (zh) * 2008-01-23 2009-07-29 华为技术有限公司 对视频对象进行跟踪的方法及装置
CN102006398A (zh) * 2010-10-29 2011-04-06 西安电子科技大学 基于特征直线的船载摄像系统电子稳像方法
CN109255801A (zh) * 2018-08-03 2019-01-22 百度在线网络技术(北京)有限公司 视频中三维物体边缘追踪的方法、装置、设备及存储介质
CN110081862A (zh) * 2019-05-07 2019-08-02 达闼科技(北京)有限公司 一种对象的定位方法、定位装置、电子设备和可存储介质
US20190333242A1 (en) * 2018-08-03 2019-10-31 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for three-dimensional object pose estimation, device and storage medium
CN112102342A (zh) * 2020-09-01 2020-12-18 腾讯科技(深圳)有限公司 平面轮廓识别方法、装置、计算机设备和存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9996974B2 (en) * 2013-08-30 2018-06-12 Qualcomm Incorporated Method and apparatus for representing a physical scene
US9607207B1 (en) * 2014-03-31 2017-03-28 Amazon Technologies, Inc. Plane-fitting edge detection
KR101798041B1 (ko) * 2016-06-29 2017-11-17 성균관대학교산학협력단 3차원 물체 인식 및 자세 추정 장치 및 그 방법
CN110189376B (zh) * 2019-05-06 2022-02-25 达闼科技(北京)有限公司 物体定位方法及物体定位装置
CN110992356B (zh) * 2019-12-17 2024-03-08 深圳辰视智能科技有限公司 目标对象检测方法、装置和计算机设备

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101493889A (zh) * 2008-01-23 2009-07-29 华为技术有限公司 对视频对象进行跟踪的方法及装置
CN102006398A (zh) * 2010-10-29 2011-04-06 西安电子科技大学 基于特征直线的船载摄像系统电子稳像方法
CN109255801A (zh) * 2018-08-03 2019-01-22 百度在线网络技术(北京)有限公司 视频中三维物体边缘追踪的方法、装置、设备及存储介质
US20190333242A1 (en) * 2018-08-03 2019-10-31 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for three-dimensional object pose estimation, device and storage medium
CN110081862A (zh) * 2019-05-07 2019-08-02 达闼科技(北京)有限公司 一种对象的定位方法、定位装置、电子设备和可存储介质
CN112102342A (zh) * 2020-09-01 2020-12-18 腾讯科技(深圳)有限公司 平面轮廓识别方法、装置、计算机设备和存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4131162A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115719492A (zh) * 2022-11-29 2023-02-28 中国测绘科学研究院 一种面状要素宽窄特征识别方法、装置、设备及可读存储介质
CN115719492B (zh) * 2022-11-29 2023-08-11 中国测绘科学研究院 一种面状要素宽窄特征识别方法、装置、设备及可读存储介质

Also Published As

Publication number Publication date
EP4131162A4 (en) 2023-11-01
CN112102342A (zh) 2020-12-18
EP4131162A1 (en) 2023-02-08
US20230015214A1 (en) 2023-01-19
CN112102342B (zh) 2023-12-01

Similar Documents

Publication Publication Date Title
WO2022048468A1 (zh) 平面轮廓识别方法、装置、计算机设备和存储介质
JP6031554B2 (ja) 単眼カメラに基づく障害物検知方法及び装置
CN106570507B (zh) 单目视频场景三维结构的多视角一致的平面检测解析方法
CN112163251B (zh) 建筑模型单体化方法、装置、存储介质及电子设备
CN103778635B (zh) 用于处理数据的方法和装置
CN110717489A (zh) Osd的文字区域的识别方法、装置及存储介质
CN109410316A (zh) 物体的三维重建的方法、跟踪方法、相关装置及存储介质
US10380796B2 (en) Methods and systems for 3D contour recognition and 3D mesh generation
US20150138193A1 (en) Method and device for panorama-based inter-viewpoint walkthrough, and machine readable medium
CN112927353A (zh) 基于二维目标检测和模型对齐的三维场景重建方法、存储介质及终端
CN110648363A (zh) 相机姿态确定方法、装置、存储介质及电子设备
CN112509126B (zh) 三维物体检测的方法、装置、设备及存储介质
CN111583381A (zh) 游戏资源图的渲染方法、装置及电子设备
CN113223078A (zh) 标志点的匹配方法、装置、计算机设备和存储介质
CA3236016A1 (en) Three-dimensional building model generation based on classification of image elements
CN112396701A (zh) 卫星图像的处理方法、装置、电子设备和计算机存储介质
US20210304411A1 (en) Map construction method, apparatus, storage medium and electronic device
CN111179281A (zh) 人体图像提取方法及人体动作视频提取方法
Saxena et al. 3-d reconstruction from sparse views using monocular vision
CN112002007A (zh) 基于空地影像的模型获取方法及装置、设备、存储介质
US10861174B2 (en) Selective 3D registration
CN115827812A (zh) 重定位方法、装置、设备及其存储介质
US11417063B2 (en) Determining a three-dimensional representation of a scene
CN115683109A (zh) 基于cuda和三维栅格地图的视觉动态障碍物检测方法
CN114549825A (zh) 目标检测方法、装置、电子设备与存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21863535

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021863535

Country of ref document: EP

Effective date: 20221102

NENP Non-entry into the national phase

Ref country code: DE