WO2023015938A1 - Three-dimensional point detection method and apparatus, electronic device, and storage medium - Google Patents

Three-dimensional point detection method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023015938A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
target
point
points
image
Prior art date
Application number
PCT/CN2022/088149
Other languages
French (fr)
Chinese (zh)
Inventor
吴思泽
金晟
刘文韬
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023015938A1 publication Critical patent/WO2023015938A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30244 Camera pose
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, and in particular to a three-dimensional point detection method, device, electronic device, and storage medium.
  • Three-dimensional (Three-Dimensional, 3D) human body pose estimation refers to estimating the pose of a human target from an image, video, or point cloud, and is often used in various industrial fields such as human body reconstruction, human-computer interaction, behavior recognition, and game modeling. In practical application scenarios, there is often a need for multi-person pose estimation. Human body center point detection can be used as a precursor task for multi-person pose estimation.
  • A human body center point detection scheme is provided in the related art, which performs multi-view feature extraction based on 3D space voxelization and detects the human body center point through a convolutional neural network (Convolutional Neural Networks, CNN).
  • In this scheme, spatial voxelization is to divide the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features can be used as the input of a 3D convolution.
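  • For reference, the following is a minimal sketch of the equidistant spatial voxelization used by the related-art scheme; the space extent and grid resolution are illustrative assumptions rather than values taken from the related art.

```python
import numpy as np

def voxelize_space(space_min, space_max, grid_size):
    """Divide an axis-aligned 3D space into equally sized voxels and return
    the (grid_size**3, 3) array of voxel-center coordinates. After multi-view
    image features are accumulated per voxel, such a grid is what a 3D CNN
    would take as input."""
    space_min = np.asarray(space_min, dtype=np.float32)
    space_max = np.asarray(space_max, dtype=np.float32)
    axes = [np.linspace(space_min[d], space_max[d], grid_size, endpoint=False)
            + (space_max[d] - space_min[d]) / (2 * grid_size)   # voxel centers
            for d in range(3)]
    xs, ys, zs = np.meshgrid(*axes, indexing="ij")
    return np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)

# Example: an 8 m x 8 m x 4 m capture space split into a 64 x 64 x 64 grid.
centers = voxelize_space([-4.0, -4.0, 0.0], [4.0, 4.0, 4.0], 64)
print(centers.shape)  # (262144, 3): the whole-space cost the disclosure avoids
```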
  • Embodiments of the present disclosure at least provide a three-dimensional point detection method, device, electronic device, and storage medium, which improve detection efficiency while improving detection accuracy.
  • an embodiment of the present disclosure provides a three-dimensional point detection method, the method being executed by an electronic device, and the method including:
  • for each target object, based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, determining the candidate three-dimensional space corresponding to the target object;
  • based on the candidate three-dimensional space corresponding to the target object and the target images, determining three-dimensional coordinate information of a target three-dimensional point of the target object.
  • In a case where the three-dimensional coordinate information of the candidate three-dimensional points of each target object is determined based on target images obtained by shooting multiple target objects from multiple viewing angles, the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the three-dimensional coordinate information of the candidate three-dimensional points of each target object and the target images.
  • In this way, the embodiments of the present disclosure can accurately detect the 3D point of each target object by utilizing the projection relationship between the candidate 3D space where the candidate 3D points of the target object are located and the target images from multiple viewing angles. At the same time, the projection operation is performed only for the candidate 3D space, which avoids voxelizing the entire space and thus significantly improves detection efficiency.
  • the embodiment of the present disclosure also provides a three-dimensional point detection device, the device includes:
  • the acquiring part is configured to acquire target images obtained by shooting multiple target objects from multiple viewing angles, and three-dimensional coordinate information of a candidate three-dimensional point of each of the multiple target objects determined based on the acquired target images;
  • the detection part is configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, and determine the three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
  • an embodiment of the present disclosure further provides an electronic device, including: a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the three-dimensional point detection method described in the first aspect or any one of its implementation modes are executed.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the three-dimensional point detection method described in the first aspect or any one of its implementation modes are executed.
  • the embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to execute the steps of the three-dimensional point detection method described in the first aspect.
  • FIG. 1 shows a flow chart of a method for three-dimensional point detection provided by an embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of the application of a method for three-dimensional point detection provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a three-dimensional point detection device provided by an embodiment of the present disclosure
  • Fig. 4 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • A human body center point detection scheme is provided in the related art, which extracts multi-view features based on 3D space voxelization and detects human body center points through a CNN.
  • In this scheme, spatial voxelization is to divide the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features can be used as the input of a 3D convolution.
  • the present disclosure provides a method, device, electronic device and storage medium for three-dimensional point detection, which improves detection efficiency while improving the accuracy of point detection.
  • the execution subject of the 3D point detection method provided in the embodiments of the present disclosure is generally an electronic device with certain computing power.
  • the electronic device includes, for example, a terminal device, a server, or another processing device; the terminal device may be user equipment (User Equipment, UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like.
  • the method for detecting a three-dimensional point may be implemented in a manner in which a processor invokes computer-readable instructions stored in a memory.
  • FIG. 1 is a flowchart of a method for three-dimensional point detection provided by an embodiment of the present disclosure
  • the method includes steps S101 to S104, wherein:
  • S101 Acquire target images obtained by shooting multiple target objects from multiple viewing angles, and 3D coordinate information of candidate 3D points of each of the multiple target objects determined based on the acquired target images;
  • S102 For each target object, based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, determine the candidate three-dimensional space corresponding to the target object; based on the candidate three-dimensional space corresponding to the target object and the target image, determine the target three-dimensional point of the target object 3D coordinate information.
  • the application scenario of the method may be briefly described next.
  • the method of 3D point detection in the embodiments of the present disclosure can be applied to the relevant application fields of multi-person 3D pose estimation.
  • For example, 3D pose estimation is performed on multiple pedestrians in front of a self-driving vehicle.
  • Another example is the field of intelligent security, in which the three-dimensional poses of multiple road vehicles and the like are estimated; this is not limited in the embodiments of the present disclosure.
  • the embodiments of the present disclosure provide a detection scheme that combines two-dimensional point matching under multiple viewing angles with candidate three-dimensional point reconstruction, which not only improves detection accuracy but also improves detection efficiency.
  • the target image acquired in the embodiment of the present disclosure may be obtained by shooting multiple target objects under multiple viewing angles, and one viewing angle may correspond to one target image.
  • the above-mentioned target images can be obtained by synchronously photographing multiple target objects with multiple cameras installed on the vehicle.
  • the multiple cameras here can be selected in combination with different user needs.
  • For example, three target images of the pedestrians ahead are captured by three cameras correspondingly installed at the two sides and the center position of the vehicle.
  • Each target image may correspond to multiple target objects, for example, it may be a captured target image including two pedestrians.
  • the three-dimensional coordinate information of the candidate three-dimensional point of each target object can be determined based on multiple target images captured under multiple viewing angles.
  • the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the candidate three-dimensional space corresponding to each target object and multiple target images under multiple viewing angles.
  • the candidate 3D point of the target object can be a candidate 3D center point located at the center of the target object, or other specific points that can characterize the target object; for example, specific points may be taken on a pedestrian's head, upper body, and lower body. The number of specific points can be set for different application scenarios, and is not limited here.
  • the candidate 3D center point is used as the candidate 3D point as an example in the following.
  • the 3D coordinate information of the candidate 3D points can be obtained by first pairing 2D points based on the target images and then performing reconstruction in 3D space based on the paired 2D points; in addition, it may also be determined in other ways, which is not limited here.
  • one or more candidate 3D points can be constructed for each target object.
  • each candidate 3D point among the multiple candidate 3D points of the target object can determine a spherical range with the 3D coordinate information of that candidate 3D point as the center of the sphere, and the candidate three-dimensional space corresponding to the target object can then be determined by performing a union operation on the spherical ranges determined by the target object's multiple candidate three-dimensional points, as sketched below.
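  • The following is a minimal sketch of representing the candidate three-dimensional space as the union of spherical ranges centred on the candidate points; the sphere radius is an illustrative assumption.

```python
import numpy as np

def in_candidate_space(query_xyz, candidate_points, radius=0.3):
    """Return a boolean per query point indicating whether it lies inside the
    union of spheres centred on the target object's candidate 3D points."""
    query_xyz = np.atleast_2d(query_xyz)        # (Q, 3)
    centers = np.atleast_2d(candidate_points)   # (C, 3)
    dists = np.linalg.norm(query_xyz[:, None, :] - centers[None, :, :], axis=-1)
    return (dists <= radius).any(axis=1)        # union over all spherical ranges
```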
  • the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the projection relationship between the spatial sampling of the corresponding candidate three-dimensional space and the target images under multiple viewing angles. Compared with voxelizing the entire space, it is only necessary to perform three-dimensional projection on the candidate three-dimensional space where the specified target object is located; in this way, more accurate three-dimensional coordinate information of the target three-dimensional point can be determined, and the amount of calculation is significantly reduced.
  • the three-dimensional coordinate information of the candidate three-dimensional point of the target object can be determined according to the following steps:
  • Step 1 Extract image feature information of a plurality of two-dimensional points from a plurality of target images, wherein each two-dimensional point in the plurality of two-dimensional points is a pixel located in a corresponding target object;
  • Step 2 Determine paired two-dimensional points belonging to the same target object based on image feature information respectively extracted from a plurality of target images, wherein the paired two-dimensional points come from different target images;
  • Step 3 Determine the 3D coordinate information of the candidate 3D points of the same target object according to the determined 2D coordinate information of the paired 2D points in the respective target images.
  • the image feature information of multiple two-dimensional points can be extracted from each target image based on an image feature extraction method, or the target image can be directly recognized by a two-dimensional point recognition network to determine the image feature information of each two-dimensional point. The image feature information of a two-dimensional point here may represent relevant features of the corresponding target object, for example, the position feature of a person's center point.
  • the determination of paired 2D points in the embodiments of the present disclosure can effectively associate the correspondence of target objects in 2D space, so that the constructed candidate 3D points can, to a certain extent, point to the same target object, thereby providing good data support for the accurate detection of multiple target objects.
  • the paired 2D points belonging to the same target object can be determined based on the image feature information respectively extracted from multiple target images, where the paired 2D points come from different target images.
  • On the one hand, the embodiments of the present disclosure may perform image pairing first and then determine the paired 2D points based on feature updates of the 2D points corresponding to the paired images; on the other hand, the features of all 2D points may be updated first and the paired two-dimensional points determined afterwards. This is not limited in the embodiments of the present disclosure.
  • Then, the 3D coordinate information of the candidate 3D points corresponding to the target object can be reconstructed.
  • here, a candidate 3D point can be reconstructed through triangulation; that is, in a multi-camera system, the 2D coordinates of the paired two-dimensional points and the camera parameters are used to reconstruct the 3D coordinates corresponding to the two-dimensional points, as sketched below.
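  • A minimal sketch of such a triangulation, using the direct linear transform over one matched 2D point per view, is given below; the embodiments do not prescribe this exact formulation, and the projection matrices stand in for the camera parameters mentioned above.

```python
import numpy as np

def triangulate(pt1, pt2, P1, P2):
    """Reconstruct a candidate 3D point from a pair of matched 2D points.

    pt1, pt2 : (2,) pixel coordinates of the paired 2D points in two views.
    P1, P2   : (3, 4) camera projection matrices (intrinsics @ extrinsics).
    """
    A = np.stack([
        pt1[0] * P1[2] - P1[0],
        pt1[1] * P1[2] - P1[1],
        pt2[0] * P2[2] - P2[0],
        pt2[1] * P2[2] - P2[1],
    ])
    # Homogeneous least squares: the right singular vector for the smallest
    # singular value of A gives the 3D point in homogeneous coordinates.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```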
  • the method for detecting three-dimensional points can perform image matching first, and then determine paired two-dimensional points.
  • the paired two-dimensional points can be determined through the following steps:
  • Step 1 Combine target images in pairs to obtain at least one group of target images;
  • Step 2 Based on the image feature information of the two-dimensional points in the multiple target images, determine whether there are two two-dimensional points with matching image features in each group of target images in the at least one group of target images, where the two two-dimensional points respectively belong to different target images in the same group of target images;
  • Step 3 In a case where it is determined that there are two 2D points with matching image features in a group of target images, determine the two 2D points with matching image features as a pair of 2D points belonging to the same target object.
  • multiple target images can be combined in pairs to obtain one or more sets of target images, and then it can be determined whether there are two two-dimensional points that match image features in each set of target images.
  • Here, matching can mean that the matching degree of the image feature information of the two two-dimensional points is greater than a preset threshold, so that two two-dimensional points whose image features match can be determined as a pair of two-dimensional points belonging to the same target object.
  • For each group of target images, the 2D points in the two target images of the group can be combined in pairs to obtain multiple groups of 2D points. The image feature information of the two two-dimensional points included in each group of two-dimensional points is then compared; that is, the following steps can be used to determine whether there are two two-dimensional points with matching image features in each group of target images.
  • Step 1 For each group of two-dimensional points, input the image feature information of the two two-dimensional points of the group into the feature matching network, and determine whether the image feature information of the two two-dimensional points matches;
  • Step 2 In a case where it is determined that the image feature information of the two 2D points matches, determine these two 2D points as the two 2D points with matching image features that exist in the group of target images.
  • a feature matching network can be used to determine whether the image feature information of two two-dimensional points matches.
  • the input of the feature matching network is a group of two-dimensional points corresponding to a group of target images.
  • the matching operation on the image feature information of the two two-dimensional points in each group of two-dimensional points is realized by means of the feature matching network, and the operation is simple.
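  • As an illustration only, a feature matching network of this kind could be a small MLP over the concatenated image feature information of the two two-dimensional points, as sketched below; the actual network architecture is not fixed by the embodiments.

```python
import torch
import torch.nn as nn

class FeatureMatchingNet(nn.Module):
    """Predicts whether two 2D-point features belong to the same target object."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, feat_a, feat_b):
        # Concatenate the two points' image feature information and score the pair.
        score = self.mlp(torch.cat([feat_a, feat_b], dim=-1))
        return torch.sigmoid(score)

# Usage: matched = FeatureMatchingNet()(feat_a, feat_b) > 0.5
```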
  • In the process of training the feature matching network, training can be performed based on image samples from multiple viewing angles and labeling information of the same target object; that is, image feature information corresponding to the two-dimensional points is extracted from the image samples from the multiple viewing angles, and the extracted image feature information can be input into the feature matching network to be trained.
  • the network parameter values of the feature matching network can be adjusted until the network output result is consistent with the labeling information, so as to train the feature matching network.
  • the trained feature matching network can be used to determine whether the image feature information of two two-dimensional points matches.
  • the image feature matching of two two-dimensional points indicates that the two two-dimensional points correspond to the same target object.
  • the embodiments of the present disclosure may update the image feature information of a two-dimensional point by combining the image feature information of the other 2D points mentioned above, and then input the updated image feature information into the feature matching network to determine whether the image feature information of the two two-dimensional points matches.
  • the image feature information of the two-dimensional point can be updated based on the image feature information of other two-dimensional points in other target images, so that the accuracy of the determined updated image feature information is higher, and the matching accuracy is further improved.
  • Step 1 Based on the two-dimensional coordinate information of the two-dimensional point in the corresponding target image and the two-dimensional coordinate information of other two-dimensional points in other target images different from the target image where the two-dimensional point is located, determine the epipolar distance between the two-dimensional point and the other two-dimensional points;
  • Step 2 Based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images different from the target image where the two-dimensional point is located, and the epipolar distance, update the image feature information of the two-dimensional point to obtain updated image feature information.
  • In this way, the image feature information of a two-dimensional point can be updated based on its own image feature information, the image feature information of other two-dimensional points in other target images different from the target image where the two-dimensional point is located, and the epipolar distance; multi-view features are thus efficiently integrated, and the matching accuracy can be significantly improved.
  • Here, the epipolar distance between two 2D points can be determined based on the respective 2D coordinate information of the 2D point and the other 2D point, and the update of the 2D point's image feature information is then realized based on the epipolar distance and the respective image feature information of the two 2D points.
  • the epipolar distances corresponding to the cameras under different viewing angles can reflect the relationship between different target points. Taking two cameras (camera 1 and camera 2) and two target points (point A and point B) as an example, point A in the view of camera 1 corresponds to a line (the epipolar line) in the view of camera 2, and the distance between that epipolar line and point B in the view of camera 2 determines the degree of proximity between the two points. Using the epipolar distance to update the feature of a two-dimensional point makes the content of the updated image feature information richer, which is more conducive to determining the subsequent three-dimensional point; a sketch of this computation follows.
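  • A minimal sketch of the epipolar distance and of an epipolar-weighted feature update is given below; the fundamental matrix F, the exponential weighting, and the mixing coefficients are illustrative assumptions rather than details fixed by the embodiments.

```python
import numpy as np

def epipolar_distance(pt1, pt2, F):
    """Distance from pt2 (in view 2) to the epipolar line of pt1 (from view 1).

    F is the 3x3 fundamental matrix mapping homogeneous view-1 points to
    epipolar lines in view 2.
    """
    x1 = np.array([pt1[0], pt1[1], 1.0])
    x2 = np.array([pt2[0], pt2[1], 1.0])
    line = F @ x1                              # a*x + b*y + c = 0 in view 2
    return abs(line @ x2) / np.hypot(line[0], line[1])

def update_feature(feat, other_feats, distances, temperature=10.0):
    """Fuse a 2D point's feature with features of 2D points from other views,
    weighting points that lie closer to the epipolar line more heavily."""
    w = np.exp(-np.asarray(distances, dtype=np.float64) / temperature)
    w = w / w.sum()
    fused = (w[:, None] * np.asarray(other_feats)).sum(axis=0)
    return 0.5 * np.asarray(feat) + 0.5 * fused
```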
  • the updated image feature information corresponding to the two-dimensional point can be directly selected for matching without updating all the two-dimensional points, which will improve the overall detection efficiency.
  • the method of 3D point detection provided by the embodiment of the present disclosure can perform feature update first, and then determine the paired 2D points, and the paired 2D points can be determined through the following steps:
  • Step 1 For the first target image among the multiple target images, based on the image feature information extracted from the first target image and the image feature information extracted from other target images except the first target image among the multiple target images , updating image feature information of multiple two-dimensional points in the first target image to obtain updated image feature information respectively corresponding to multiple two-dimensional points in the first target image;
  • Step 2 Determine pairs of two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the multiple target images.
  • Here, the image feature information of multiple two-dimensional points in the first target image can be updated based on the image feature information extracted from the target images other than the first target image among the multiple target images; two target images are then arbitrarily selected from the multiple target images, two corresponding two-dimensional points are selected from the two selected target images, and the updated image feature information corresponding to the two selected two-dimensional points is input into the pre-trained feature matching network to determine whether the two selected 2D points are paired 2D points belonging to the same target object.
  • two target images may be arbitrarily selected from the multiple target images, and two corresponding two-dimensional points may be respectively selected from the two selected target images; for the process of using the pre-trained feature matching network to verify whether the features of the two points match, reference may be made to the description of the first aspect above.
  • the matching operation corresponding to two 2D points in the two target images can be realized based on the selection operation, and once it is determined that the two 2D points in the two target images are successfully matched, a target object can be locked onto, so the amount of computation is significantly reduced.
  • That is, two target images can be selected arbitrarily, and then two corresponding two-dimensional points can be selected; once the image features of these two two-dimensional points are successfully matched, the candidate 3D point of the corresponding target object can be determined based on the paired 2D points without verifying all pairing situations, which improves the overall detection efficiency.
  • In the process of determining the target 3D point of each target object based on the constructed candidate 3D points, the corresponding candidate 3D space can be determined first, and the three-dimensional coordinate information of the target three-dimensional point of each target object can then be determined based on a projection operation from the 3D space to the 2D space.
  • the three-dimensional coordinate information of the target three-dimensional point can be determined through the following steps:
  • Step 1 Carry out spatial sampling of the candidate three-dimensional space of the target object, and determine a plurality of sampling points;
  • Step 2 for each sampling point in the plurality of sampling points, based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target image, determine the three-dimensional point detection result corresponding to the sampling point;
  • Step 3 Determine the three-dimensional coordinate information of the target three-dimensional point of each target object based on the obtained three-dimensional point detection result.
  • adaptive sampling can be carried out for the candidate 3D space corresponding to each target object.
  • For example, equidistant sampling is performed in this search space, and then, based on the 3D coordinate information of each sampling point in the candidate 3D space and the multiple target images, the 3D point detection result corresponding to each sampling point is determined. In this way, further fine sampling around the reconstructed candidate 3D point can be realized, so that a more accurate target 3D point position can be obtained; a sketch of the sampling follows.
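  • A minimal sketch of equidistant sampling inside a spherical candidate space is shown below; the radius and step size are illustrative assumptions and would in practice follow the size of the candidate three-dimensional space.

```python
import numpy as np

def sample_candidate_space(center, radius=0.3, step=0.05):
    """Equidistantly sample the candidate 3D space around a candidate 3D point,
    keeping only the sampling points that fall inside the spherical range."""
    offsets = np.arange(-radius, radius + 1e-9, step)
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)
    grid = grid[np.linalg.norm(grid, axis=1) <= radius]   # stay inside the sphere
    return grid + np.asarray(center)
```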
  • the corresponding candidate three-dimensional space can be determined, and the detection of relevant three-dimensional points can be realized based on the sampling of the candidate three-dimensional space.
  • the sampling operation for the candidate three-dimensional space significantly improves the detection efficiency.
  • the three-dimensional point detection results corresponding to the sampling points can be determined through the following steps:
  • Step 1 For each sampling point among the multiple sampling points, based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate system of each viewing angle, project the three-dimensional coordinate information of the sampling point to the different viewing angles, and determine the two-dimensional projection point information of the sampling point in the multiple target images respectively;
  • Step 2 Based on the two-dimensional projection point information of the sampling point in the multiple target images, determine the sampling point feature information of the sampling point under the different viewing angles;
  • Step 3 Determine the 3D point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under the different viewing angles.
  • the 3D point detection method provided by the embodiments of the present disclosure can first determine the two-dimensional projection point information of the sampling point in multiple target images, and then determine the sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information .
  • the connection relationship of sampling points under different viewing angles can be determined by using the sampling point feature information of the sampling points under the different viewing angles; such a connection relationship helps to determine more accurate sampling point feature information, which in turn improves the accuracy of the determined 3D point detection results.
  • the two-dimensional projection point information can be determined based on the conversion relationship between the three-dimensional coordinate system where the sampling point is located and the two-dimensional coordinate system where the target image is located; that is, the sampling point can be projected onto the target image by using this conversion relationship, thereby determining the image position and other information of the two-dimensional projection point of the sampling point on the target image, as sketched below.
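  • A minimal sketch of this projection under a pinhole camera model follows; the intrinsics K and extrinsics R, t stand in for the conversion relationship between the two coordinate systems.

```python
import numpy as np

def project_point(xyz, K, R, t):
    """Project a 3D sampling point into one camera view.

    K : (3, 3) camera intrinsics.
    R : (3, 3), t : (3,) world-to-camera rotation and translation.
    Returns the 2D pixel position of the two-dimensional projection point.
    """
    cam = R @ np.asarray(xyz, dtype=np.float64) + t   # world -> camera coordinates
    uvw = K @ cam                                     # camera -> image plane
    return uvw[:2] / uvw[2]                           # perspective division
```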
  • Based on the two-dimensional projection point information, the sampling point feature information of the sampling point under different viewing angles can be determined.
  • The sampling point feature information determined here can be feature information under different viewing angles. This is because, for the same target object, there is a certain connection relationship between the corresponding sampling points under different viewing angles, and this connection relationship can be used to update the features of the sampling points; in addition, under the same viewing angle there is also a certain connection relationship between the corresponding sampling points, which can likewise be used to update the features of the sampling points, so that the determined sampling point feature information is more in line with the actual 3D information of the target object.
  • Step 1 Extract image features respectively corresponding to the multiple target images;
  • Step 2 For each of the multiple target images, based on the image position information of the two-dimensional projection point of the sampling point in that target image, extract the image feature corresponding to the image position information from the image features corresponding to the target image;
  • Step 3 Use the extracted image features corresponding to the image position information to determine the sampling point feature information of the sampling point under the different viewing angles.
  • the characteristic information of the sampling point matching the sampling point can be determined based on the correspondence between the two-dimensional projection point information of the sampling point in multiple target images and the image features, and the operation is simple.
  • In order to extract the sampling point feature information matching a sampling point, the 3D point detection method provided by the embodiments of the present disclosure can, based on the image position information of the 2D projection points of the sampling point in the multiple target images, extract the image feature corresponding to the image position information from the image features of the corresponding target image, and use the extracted image feature as the sampling point feature information matched with the sampling point.
  • the image features corresponding to the target image can be obtained based on image processing, extracted based on a trained feature extraction network, or determined by other methods that can extract information representing the target object, the scene where the target object is located, and the like; this is not limited in the embodiments of the present disclosure.
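  • As one possible realization, the sampling point feature at a projected image position can be read out of a view's feature map by bilinear interpolation, as sketched below; the backbone producing the feature map is not restricted by the embodiments.

```python
import numpy as np

def sample_feature(feature_map, uv):
    """Bilinearly sample a (C, H, W) feature map at pixel position uv = (u, v)."""
    c, h, w = feature_map.shape
    u = float(np.clip(uv[0], 0, w - 1.001))
    v = float(np.clip(uv[1], 0, h - 1.001))
    u0, v0 = int(u), int(v)
    du, dv = u - u0, v - v0
    f = feature_map
    return ((1 - du) * (1 - dv) * f[:, v0, u0] +
            du * (1 - dv) * f[:, v0, u0 + 1] +
            (1 - du) * dv * f[:, v0 + 1, u0] +
            du * dv * f[:, v0 + 1, u0 + 1])
```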
  • the sampling point feature information of the sampling point can be updated first, and the 3D point detection result corresponding to the sampling point can then be determined based on the updated sampling point feature information; that is, the 3D point detection result corresponding to the sampling point can be determined through the following steps:
  • Step 1 Based on the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point, determine the updated sampling point feature information of the sampling point under different viewing angles;
  • Step 2 Based on the updated sampling point feature information corresponding to the sampling point, determine the three-dimensional point detection result corresponding to the sampling point.
  • the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point can be used to update the sampling point feature information of the sampling point. To a certain extent, the updated sampling point feature information includes the features of other sampling points within one view as well as the features of sampling points across different views, making the sampling point features more accurate and thus making the determined 3D pose information more accurate.
  • sampling points associated with the sampling point may be sampling points that have a connection relationship with the sampling point.
  • the connection relationship here corresponds, within the same view, to the connection relationship between sampling points; for sampling points under different viewing angles, what can be determined is the connection relationship between the two-dimensional projection points determined for the same sampling point under the different views.
  • Step 1 Based on the sampling point feature information of the sampling point under different viewing angles and the first connection relationship between the two-dimensional projection points of the sampling point under the different viewing angles, perform a first update on the sampling point feature information of the sampling point under the different viewing angles to obtain first updated sampling point feature information; and, based on the sampling point feature information of the sampling point under a target viewing angle and the sampling point feature information of other sampling points that belong to the target viewing angle and have a second connection relationship with the sampling point, perform a second update on the sampling point feature information of the sampling point under the target viewing angle to obtain second updated sampling point feature information;
  • Step 2 Based on the first updated sampling point feature information and the second updated sampling point feature information, determine the updated sampling point feature information of the sampling point under the target viewing angle.
  • the first connection relationship between the two-dimensional projection points of the sampling point under different viewing angles is predetermined; based on the first connection relationship, the feature information of the sampling point under one viewing angle can be updated, that is, the first updated sampling point feature information fuses the sampling point features of the same sampling point in other views.
  • In addition, the sampling point feature information of the sampling point can be updated based on the sampling point feature information of other sampling points that belong to the target viewing angle and have a second connection relationship with the sampling point; the second connection relationship can also be predetermined, so that the determined second updated sampling point feature information incorporates the sampling point features of other sampling points in the same view.
  • Combining the first updated sampling point feature information and the second updated sampling point feature information can make the updated sampling point feature information of the determined sampling point under the target view angle more accurate. For updates of sampling points in other perspectives, refer to the above description.
  • A graph neural network (Graph Neural Network, GNN) can be used to update the feature information of the above sampling points.
  • For example, a graph model can be constructed based on the first connection relationship, the second connection relationship, and the feature information of the sampling points, and the sampling point feature information can be continuously updated by performing convolution operations on the graph model, as sketched below.
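  • A minimal sketch of one round of such graph-based feature updating is given below, with cross-view edges standing for the first connection relationship and intra-view edges for the second; the mean-aggregation graph convolution and the feature dimensions are illustrative choices, not details fixed by the embodiments.

```python
import torch
import torch.nn as nn

class SamplingPointGNN(nn.Module):
    """One round of sampling-point feature updating over two edge types."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.cross_view = nn.Linear(2 * feat_dim, feat_dim)  # first connection relationship
        self.intra_view = nn.Linear(2 * feat_dim, feat_dim)  # second connection relationship
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, feats, adj_cross, adj_intra):
        # feats: (N, D) node features, one node per (sampling point, view) pair.
        # adj_cross / adj_intra: (N, N) float adjacency matrices of the two edge types.
        def aggregate(adj, proj):
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
            neigh = (adj @ feats) / deg                 # mean over neighbours
            return torch.relu(proj(torch.cat([feats, neigh], dim=-1)))

        first_update = aggregate(adj_cross, self.cross_view)    # fuse across views
        second_update = aggregate(adj_intra, self.intra_view)   # fuse within a view
        return torch.relu(self.fuse(torch.cat([first_update, second_update], dim=-1)))
```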
  • the updated sampling point feature information corresponding to all the sampling points of the target object can be input into the three-dimensional point detection network to obtain the 3D point detection results corresponding to the target object.
  • Among these results, the three-dimensional coordinate information of the sampling point with the highest prediction probability may be determined as the three-dimensional coordinate information of the target three-dimensional point corresponding to the target object.
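  • A minimal sketch of this final step is shown below; the scoring head is an illustrative stand-in for the three-dimensional point detection network, and the sampling point with the highest predicted probability supplies the target 3D point's coordinates.

```python
import torch
import torch.nn as nn

# Illustrative scoring head standing in for the 3D point detection network.
score_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

def detect_target_point(updated_feats, sample_xyz):
    """updated_feats: (N, 256) per-sampling-point features after graph updates.
    sample_xyz: (N, 3) 3D coordinate information of the sampling points."""
    probs = torch.sigmoid(score_head(updated_feats)).squeeze(-1)   # (N,) probabilities
    best = torch.argmax(probs)
    # The highest-probability sampling point gives the target 3D point.
    return sample_xyz[best], probs[best]
```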
  • As shown in FIG. 2, node V corresponds to the image feature information of the 2D center points under each viewing angle, and edge E corresponds to the relationship between the nodes, which can be the epipolar distance between the 2D center points.
  • the image feature information of the 2D center point under different viewing angles can be updated.
  • the graph neural network 201 can be used to update the features.
  • the feature matching network 202 can be used to determine whether each pair of 2D center points (that is, each edge) belongs to the same target object; after feature updating and feature matching, the pairing relationship shown by the solid lines in the lower part of FIG. 2 can be obtained.
  • a candidate three-dimensional space can be determined for each target object, such as the spherical three-dimensional space pointed by the dotted line in the lower part of FIG. 2 .
  • the detection results of the three-dimensional center points corresponding to the relevant sampling points can be determined, and the three-dimensional coordinate information of the target three-dimensional center point of each target object can then be determined.
  • Since the 3D point detection method provided by the embodiments of the present disclosure can further search for the target 3D point of each target object within the candidate 3D space, it can, to a certain extent, reduce the reconstruction error caused by inaccurately reconstructed candidate 3D points.
  • the target 3D point of each target object can be determined through a search operation over the candidate 3D space where the target object is located; for example, when there is a pairing error between target object A and target object B while the pairing of target object B and target object C is correct, the search result of the erroneous pairing can be verified against the search result of the correct pairing, further improving the detection accuracy for multiple target objects.
  • the approximate position of each target object can be determined, and subsequent multi-person pose estimation can be realized; other related applications can also be realized.
  • the embodiment of the present disclosure also provides a three-dimensional point detection device corresponding to the three-dimensional point detection method; since the problem-solving principle of the device in the embodiments of the present disclosure is similar to that of the above-mentioned three-dimensional point detection method of the embodiments of the present disclosure, the implementation of the device can refer to the implementation of the method.
  • FIG. 3 is a schematic diagram of a three-dimensional point detection device provided by an embodiment of the present disclosure.
  • the device includes: an acquisition part 301 and a detection part 302; wherein,
  • the acquiring part 301 is configured to acquire target images obtained by shooting multiple target objects under multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
  • the detection part 302 is configured to, for each target object, determine the candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; determine the target object based on the candidate three-dimensional space corresponding to the target object and the target image The three-dimensional coordinate information of the target three-dimensional point.
  • the embodiments of the present disclosure can accurately detect the 3D point of each target object by utilizing the projection relationship between the candidate 3D space where the candidate 3D points of the target object are located and the target images from multiple viewing angles. At the same time, the projection operation is performed only for the candidate 3D space, which avoids voxelizing the entire space and thus significantly improves detection efficiency.
  • the 3D point includes a 3D center point; the candidate 3D point includes a candidate 3D center point, and the candidate 3D center point of the target object is located at the center of the target object; the target 3D point includes the target 3D center point.
  • the detection part 302 is configured to determine the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images according to the following steps: performing spatial sampling on the candidate three-dimensional space of the target object to determine a plurality of sampling points; for each sampling point, determining the three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images; and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the obtained three-dimensional point detection results.
  • the detection part 302 is configured to determine the 3D point detection result corresponding to the sampling point based on the 3D coordinate information of the sampling point in the candidate 3D space and the target image according to the following steps:
  • based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate system of each viewing angle, the three-dimensional coordinate information of the sampling point is projected to the different viewing angles, and the two-dimensional projection point information of the sampling point in the multiple target images is determined;
  • based on the two-dimensional projection point information of the sampling point in the multiple target images, the sampling point feature information of the sampling point under the different viewing angles is determined;
  • based on the sampling point feature information of the sampling point under the different viewing angles, the 3D point detection result corresponding to the sampling point is determined.
  • the two-dimensional projection point information includes image position information of the two-dimensional projection point;
  • the detection part 302 is configured to determine the sampling point feature information of the sampling point under different viewing angles, based on the two-dimensional projection point information of the sampling point respectively in the multiple target images, according to the following steps:
  • the extracted image features corresponding to the image position information are used to determine the feature information of the sampling points under different viewing angles.
  • the detection part 302 is configured to determine the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles according to the following steps:
  • Based on the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point, determine the updated sampling point feature information of the sampling point under the different viewing angles;
  • a three-dimensional point detection result corresponding to the sampling point is determined.
  • the acquisition part 301 is configured to determine the three-dimensional coordinate information of the candidate three-dimensional points of each target object according to the following steps:
  • each two-dimensional point is a pixel located in a corresponding target object
  • the three-dimensional coordinate information of the candidate three-dimensional points of the same target object is determined according to the determined two-dimensional coordinate information of the paired two-dimensional points in the respective target images.
  • the acquiring part 301 is configured to determine pairs of two-dimensional points belonging to the same target object based on image feature information respectively extracted from multiple target images according to the following steps:
  • Based on the image feature information of the two-dimensional points in the multiple target images, determine whether there are two two-dimensional points with matching image features in each group of target images; the two two-dimensional points respectively belong to different target images in the same group of target images;
  • the two two-dimensional points with matching image features are determined as a pair of two-dimensional points belonging to the same target object.
  • the acquiring part 301 is configured to determine, based on the image feature information of the two-dimensional points in the multiple target images, whether there are two 2D points with matching image features in each group of target images, according to the following steps:
  • For each group of target images, combine the two-dimensional points in the two target images of the group in pairs to obtain multiple groups of two-dimensional points; based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, determine whether there are two 2D points with matching image features in the group of target images.
  • the acquisition part 301 is configured to determine, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, whether there are two two-dimensional points with matching image features in the group of target images, according to the following steps:
  • For each group of two-dimensional points, input the image feature information of the two two-dimensional points of the group into the feature matching network, and determine whether the image feature information of the two two-dimensional points matches;
  • any group of two two-dimensional points whose image features match is determined as the two two-dimensional points with matching image features that exist in the group of target images.
  • the acquisition part 301 is configured to input the image feature information of the two two-dimensional points of the group of two-dimensional points into the feature matching network and determine whether the image feature information of the two two-dimensional points matches, according to the following steps:
  • the image feature information of the two-dimensional points is updated to obtain updated image feature information, and the updated image feature information is input into the feature matching network to determine whether the image feature information of the two two-dimensional points matches.
  • the acquiring part 301 is configured to determine pairs of two-dimensional points belonging to the same target object based on image feature information respectively extracted from multiple target images according to the following steps:
  • the image feature information of the multiple two-dimensional points in the first target image is updated to obtain updated image feature information respectively corresponding to the multiple two-dimensional points in the first target image;
  • pairs of two-dimensional points belonging to the same target object are determined.
  • the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the image feature information respectively updated in multiple target images according to the following steps:
  • the acquisition part 301 is configured to update the image feature information of the two-dimensional point according to the following steps:
  • Based on the image feature information of the two-dimensional point, the image feature information of other two-dimensional points in other target images different from the target image where the two-dimensional point is located, and the epipolar distance, the image feature information of the two-dimensional point is updated to obtain updated image feature information.
  • FIG. 4 is a schematic structural diagram of the electronic device provided by the embodiment of the present disclosure, including: a processor 401 , a memory 402 , and a bus 403 .
  • the memory 402 stores machine-readable instructions executable by the processor 401 (for example, execution instructions corresponding to the acquisition part 301 and the detection part 302 in the device in FIG. 3); when the electronic device is running, the processor 401 communicates with the memory 402 through the bus 403, and when the machine-readable instructions are executed by the processor 401, the steps of the three-dimensional point detection method described in the foregoing method embodiments are performed.
  • An embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the method for three-dimensional point detection described in the above-mentioned method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to execute the steps of the three-dimensional point detection method described in the foregoing method embodiments, to which reference may be made.
  • the above-mentioned computer program product may be realized by hardware, software or a combination thereof.
  • In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) or the like.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the computer software product is stored in a storage medium and includes several instructions used to cause an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned computer-readable storage medium may be a tangible device capable of retaining and storing instructions used by an instruction execution device, and may be a volatile storage medium or a nonvolatile storage medium.
  • a computer readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above.
  • A non-exhaustive list of computer-readable storage media includes: a portable computer disk, a hard disk, a random access memory (Random Access Memory, RAM), a read-only memory (Read Only Memory, ROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM or flash memory), a static random-access memory (Static Random-Access Memory, SRAM), a portable compact disk read-only memory (Compact Disk Read Only Memory, CD-ROM), a digital versatile disc (Digital Versatile Disc, DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the foregoing.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • the embodiment of the present disclosure acquires target images obtained by shooting multiple target objects from multiple viewing angles, and 3D coordinate information of candidate 3D points of each of the multiple target objects determined based on the acquired target images; for each target object, the following steps are performed: determining the candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images. In this way, using the projection relationship between the candidate 3D space where the candidate 3D points of the target object are located and the target images from multiple viewing angles, the 3D point of each target object can be accurately detected; at the same time, the projection operation for the candidate 3D space avoids voxelizing the entire space, which significantly improves detection efficiency.

Abstract

The present invention provides a three-dimensional point detection method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a target image obtained by photographing a plurality of target objects at a plurality of viewing angles, and three-dimensional coordinate information of a candidate three-dimensional point of each target object in the plurality of target objects determined on the basis of the obtained target image; and for each target object, performing the following steps: determining a candidate three-dimensional space corresponding to the target object on the basis of the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and determining three-dimensional coordinate information of a target three-dimensional point of the target object on the basis of the candidate three-dimensional space corresponding to the target object and the target image.

Description

Method, apparatus, electronic device and storage medium for three-dimensional point detection
Cross-Reference to Related Applications
The present disclosure is based on the Chinese patent application No. 202110929512.6, filed on August 13, 2021 and entitled "Method, apparatus, electronic device and storage medium for three-dimensional point detection", and claims priority to that application, the entire contents of which are hereby incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the technical field of artificial intelligence, and relates to a three-dimensional point detection method, apparatus, electronic device and storage medium.
Background
Three-dimensional (3D) human body pose estimation refers to estimating the pose of a human target from an image, a video or a point cloud, and is widely used in industrial fields such as human body reconstruction, human-computer interaction, behavior recognition and game modeling. In practical application scenarios, multi-person pose estimation is often required, and human body center point detection can serve as a precursor task for multi-person pose estimation.
The related art provides a human body center point detection scheme that performs multi-view feature extraction based on 3D space voxelization and detects human body center points through a convolutional neural network (CNN). Spatial voxelization divides the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features serve as the input of a 3D convolution.
However, different targets cannot be effectively distinguished during voxelization, which leads to poor accuracy of the detected human body center points. At the same time, since the voxelization is performed over the entire space, it consumes a large amount of computation.
Summary
Embodiments of the present disclosure provide at least a three-dimensional point detection method, apparatus, electronic device and storage medium, which improve detection efficiency while improving detection accuracy.
In a first aspect, an embodiment of the present disclosure provides a three-dimensional point detection method, the method being executed by an electronic device and including:
acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
for each target object, determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object; and
determining three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
With the above three-dimensional point detection method, once the three-dimensional coordinate information of the candidate three-dimensional points of each target object has been determined based on target images obtained by photographing multiple target objects from multiple viewing angles, the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the three-dimensional coordinate information of the candidate three-dimensional points of that target object and the target images.
The embodiments of the present disclosure use the projection relationship between the candidate three-dimensional space in which the candidate three-dimensional points of a target object are located and the target images from the multiple viewing angles, so that the three-dimensional point of each target object can be detected accurately. At the same time, the projection operation performed on the candidate three-dimensional space avoids voxelizing the entire space, which significantly improves detection efficiency.
In a second aspect, an embodiment of the present disclosure further provides a three-dimensional point detection apparatus, the apparatus including:
an acquiring part, configured to acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
a detecting part, configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, and determine three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
In a third aspect, an embodiment of the present disclosure further provides an electronic device including a processor, a memory and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the three-dimensional point detection method described in the first aspect or any of its implementations are performed.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when run by a processor, performs the steps of the three-dimensional point detection method described in the first aspect or any of its implementations.
In a fifth aspect, an embodiment of the present disclosure further provides a computer program product including a computer program or instructions, where the computer program or instructions, when run on a computer, cause the computer to perform the steps of the three-dimensional point detection method described in the first aspect or any of its implementations.
For descriptions of the effects of the above three-dimensional point detection apparatus, electronic device and computer-readable storage medium, reference is made to the description of the above three-dimensional point detection method.
To make the above objects, features and advantages of the embodiments of the present disclosure more comprehensible, embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are briefly introduced below. The drawings are incorporated into and constitute a part of the specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the embodiments of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be derived from these drawings without creative effort.
Fig. 1 shows a flowchart of a three-dimensional point detection method provided by an embodiment of the present disclosure;
Fig. 2 shows a schematic diagram of an application of a three-dimensional point detection method provided by an embodiment of the present disclosure;
Fig. 3 shows a schematic diagram of a three-dimensional point detection apparatus provided by an embodiment of the present disclosure;
Fig. 4 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided in the drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the embodiments of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.
The term "and/or" herein describes an association relationship and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the term "at least one of" herein means any one of multiple items, or any combination of at least two of them; for example, including at least one of A, B and C may mean including any one or more elements selected from the set consisting of A, B and C.
Research has found that the related art provides a human body point detection scheme that performs multi-view feature extraction based on 3D space voxelization and detects human body points through a CNN. Spatial voxelization divides the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features serve as the input of a 3D convolution.
However, different targets cannot be effectively distinguished during voxelization, which leads to poor accuracy of the detected points. At the same time, since the voxelization is performed over the entire space, it consumes a large amount of computation.
Based on the above research, the present disclosure provides a three-dimensional point detection method, apparatus, electronic device and storage medium, which improve detection efficiency while improving the accuracy of point detection.
To facilitate understanding of this embodiment, a three-dimensional point detection method disclosed in an embodiment of the present disclosure is first introduced in detail. The execution subject of the three-dimensional point detection method provided by the embodiments of the present disclosure is generally an electronic device with a certain computing capability, for example a terminal device, a server or another processing device. The terminal device may be a user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the three-dimensional point detection method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to Fig. 1, which is a flowchart of the three-dimensional point detection method provided by an embodiment of the present disclosure, the method includes steps S101 and S102:
S101: acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
S102: for each target object, determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, and determining three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
To facilitate understanding of the three-dimensional point detection method provided by the embodiments of the present disclosure, its application scenarios are briefly described first. The three-dimensional point detection method in the embodiments of the present disclosure can be applied to fields related to multi-person three-dimensional pose estimation, for example, three-dimensional pose estimation of multiple pedestrians in front of an autonomous vehicle in the field of autonomous driving, or estimation of the three-dimensional poses of multiple road vehicles in the field of intelligent security, which is not limited in the embodiments of the present disclosure. The following examples are mostly given in the field of autonomous driving.
In the related-art scheme that combines voxelization and a CNN to detect multi-target center points, different targets cannot be effectively distinguished during voxelization, which leads to poor accuracy of the detected points; at the same time, since the voxelization is performed over the entire space, it consumes a large amount of computation. In addition, even other schemes, such as reconstructing target points by combining epipolar matching and triangulation, suffer from poor detection accuracy due to the influence of epipolar matching.
It is precisely to solve the above problems that the embodiments of the present disclosure provide a detection scheme that combines two-dimensional point matching across multiple viewing angles with candidate three-dimensional point reconstruction, which improves detection efficiency while improving detection accuracy.
The target images acquired in the embodiments of the present disclosure may be captured for multiple target objects from multiple viewing angles, and one viewing angle may correspond to one target image. In the field of autonomous driving, the target images may be obtained by multiple cameras installed on a vehicle synchronously photographing multiple target objects; the multiple cameras may be selected according to different user requirements, for example, three cameras installed at the two sides and the center of the vehicle front that capture three target images of pedestrians ahead. Each target image may correspond to multiple target objects, for example, a captured target image including two pedestrians.
In the embodiments of the present disclosure, the three-dimensional coordinate information of the candidate three-dimensional points of each target object can be determined based on multiple target images captured from multiple viewing angles. Once the candidate three-dimensional space corresponding to each target object has been determined based on this three-dimensional coordinate information, the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the candidate three-dimensional space corresponding to that target object and the multiple target images from the multiple viewing angles.
Here, a candidate three-dimensional point of a target object may be a candidate three-dimensional center point located at the center of the target object, or another specific point capable of characterizing the target object; for example, a single specific point may be taken on a person's head, or one specific point may be taken on each of a pedestrian's head, upper body and lower body. The number of specific points can be set according to different application scenarios and is not limited here. For ease of description, the candidate three-dimensional center point is mostly used as the candidate three-dimensional point in the following examples.
The three-dimensional coordinate information of a candidate three-dimensional point may be obtained by first pairing two-dimensional points across the target images and then reconstructing the candidate three-dimensional point from the paired two-dimensional points. It may also be determined by other methods, which is not limited here.
Here, one or more candidate three-dimensional points can be constructed for each target object. Taking the case where multiple candidate three-dimensional points are constructed for one target object as an example, each candidate three-dimensional point determines a spherical range centered at the three-dimensional coordinates of that candidate three-dimensional point, and the union of the spherical ranges determined by the multiple candidate three-dimensional points of the target object yields the candidate three-dimensional space corresponding to that target object.
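Purely as an illustration of the union of spherical ranges just described, the following is a minimal sketch of a membership test for one object's candidate three-dimensional space; the radius value and function name are assumptions and not fixed by the disclosure.
```python
import numpy as np

def in_candidate_space(query_point, candidate_points, radius=0.3):
    """Check whether `query_point` lies inside one object's candidate 3D space,
    modelled as the union of balls of the given radius (an assumed value, in
    metres) centred at the object's candidate 3D points."""
    candidate_points = np.atleast_2d(candidate_points)              # shape (K, 3)
    dists = np.linalg.norm(candidate_points - np.asarray(query_point), axis=1)
    return bool((dists <= radius).any())
```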
The three-dimensional coordinate information of the target three-dimensional point of each target object may be determined based on the projection relationship between spatial samples of the corresponding candidate three-dimensional space and the target images from the multiple viewing angles. This avoids voxelizing the entire space: only the candidate three-dimensional space in which the specified target object is located needs to be projected, so more accurate three-dimensional coordinate information of the target three-dimensional point can be determined while the amount of computation drops significantly.
Considering that the determination of the three-dimensional coordinate information of the candidate three-dimensional points of a target object plays a key role in the subsequent three-dimensional point detection, the process of determining this three-dimensional coordinate information is described next.
In the embodiments of the present disclosure, the three-dimensional coordinate information of the candidate three-dimensional points of a target object can be determined according to the following steps:
Step 1: extracting image feature information of multiple two-dimensional points from the multiple target images, where each of the multiple two-dimensional points is a pixel located on the corresponding target object;
Step 2: determining paired two-dimensional points belonging to the same target object based on the image feature information extracted from the multiple target images, where the paired two-dimensional points come from different target images;
Step 3: determining the three-dimensional coordinate information of a candidate three-dimensional point of the same target object according to the two-dimensional coordinate information of the determined paired two-dimensional points in their respective target images.
Here, the image feature information of the multiple two-dimensional points can be extracted from each target image using an image feature extraction method, or the target image can be fed directly to a two-dimensional point recognition network to determine the image feature information of each two-dimensional point. The image feature information of a two-dimensional point represents features related to the corresponding target object, for example, the position feature of the center point of a person.
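The disclosure leaves the two-dimensional point extractor open; purely as an illustration, the sketch below assumes a per-view centre heatmap from which local maxima are taken as 2D points, with their feature vectors read from the per-view feature map. The function name, threshold and point count are assumptions.
```python
import torch
import torch.nn.functional as F

def decode_2d_points(heatmap, feature_map, k=10, thresh=0.3):
    """heatmap: (H, W) centre-point confidences; feature_map: (C, H, W).
    Returns up to k 2D points (x, y) and their image feature vectors."""
    pooled = F.max_pool2d(heatmap[None, None], 3, stride=1, padding=1)[0, 0]
    peaks = (heatmap == pooled) & (heatmap > thresh)       # simple non-maximum suppression
    ys, xs = torch.nonzero(peaks, as_tuple=True)
    order = torch.argsort(heatmap[ys, xs], descending=True)[:k]
    ys, xs = ys[order], xs[order]
    points = torch.stack([xs, ys], dim=1)                  # (x, y) pixel coordinates
    feats = feature_map[:, ys, xs].t()                     # one feature vector per point
    return points, feats
```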
In the embodiments of the present disclosure, the determination of paired two-dimensional points effectively associates the correspondence of target objects in the two-dimensional space, so that the constructed candidate three-dimensional points, to a certain extent, point to the same target object, which provides good data support for the accurate detection of multiple target objects.
In the embodiments of the present disclosure, paired two-dimensional points belonging to the same target object can be determined based on the image feature information extracted from the multiple target images; the paired two-dimensional points come from different target images.
In some embodiments, image pairing may be performed first and the paired two-dimensional points then determined based on the updated features of the two-dimensional points of the paired images; alternatively, the features of all two-dimensional points may be updated first and the paired two-dimensional points then determined. The embodiments of the present disclosure do not limit this.
Whichever pairing manner is used, since paired two-dimensional points belong to the same target object, a candidate three-dimensional point of the corresponding target object can be reconstructed based on the two-dimensional coordinate information of any pair of two-dimensional points in their respective target images. In some embodiments, for each pair of two-dimensional points judged to belong to the same target object, a candidate three-dimensional point can be reconstructed by triangulation; that is, in a multi-camera system, the 3D coordinates corresponding to the two-dimensional points are reconstructed from the 2D coordinates of the two-dimensional points in multiple views and the camera parameters.
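As a concrete illustration of the triangulation step just described, the following is a minimal sketch of linear (DLT) triangulation for one pair of matched 2D points, assuming the 3x4 projection matrices of the two cameras are known from the camera parameters; the function name is illustrative.
```python
import numpy as np

def triangulate_pair(p1, p2, P1, P2):
    """Reconstruct a candidate 3D point from a pair of matched 2D points.

    p1, p2: (x, y) pixel coordinates of the paired 2D points in the two views.
    P1, P2: 3x4 camera projection matrices (intrinsics @ [R | t]) of the views.
    Returns the 3D coordinates of the candidate point in the world frame.
    """
    # Each view contributes two linear constraints A @ X_h = 0 on the
    # homogeneous 3D point X_h (from x * (P[2] @ X) = P[0] @ X, and likewise for y).
    A = np.stack([
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
        p2[0] * P2[2] - P2[0],
        p2[1] * P2[2] - P2[1],
    ])
    # Least-squares solution of the homogeneous system: the right singular
    # vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]
```
Since a candidate 3D point is reconstructed for every pair of 2D points judged to belong to the same target object, one object may accumulate several candidate points, which is consistent with the union-of-spheres construction above.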
Considering that the determination of paired two-dimensional points plays a key role in reconstructing the target three-dimensional point, it is explained next from the following two aspects.
First aspect: the three-dimensional point detection method provided by the embodiments of the present disclosure may perform image matching first and then determine the paired two-dimensional points; the paired two-dimensional points can be determined through the following steps:
Step 1: combining the target images in pairs to obtain at least one group of target images;
Step 2: determining, based on the image feature information of the two-dimensional points in the multiple target images, whether each group of target images in the at least one group contains two two-dimensional points whose image features match, where the two two-dimensional points belong to different target images in the same group;
Step 3: when it is determined that a group of target images contains two two-dimensional points whose image features match, determining the two two-dimensional points whose image features match as a pair of two-dimensional points belonging to the same target object.
Here, the multiple target images can be combined in pairs to obtain one or more groups of target images, and it can then be determined whether each group of target images contains two two-dimensional points whose image features match. Matching here may mean that the matching degree of the image feature information of the two two-dimensional points is greater than a preset threshold; in this way, the two two-dimensional points whose image features match can be determined as a pair of two-dimensional points belonging to the same target object.
In this way, the determination of paired two-dimensional points belonging to the same target object is realized based on image grouping and image feature matching, which greatly increases the likelihood that the determined paired two-dimensional points correspond to the same target object, thereby improving the accuracy of the subsequent three-dimensional point detection.
Considering that there are multiple two-dimensional points in each target image, the two-dimensional points in the two target images of each group can first be combined in pairs to obtain multiple groups of two-dimensional points, and the image feature information of the two two-dimensional points in each group of two-dimensional points can then be compared. That is, whether a group of target images contains two two-dimensional points whose image features match can be determined through the following steps:
Step 1: for each group of two-dimensional points, inputting the image feature information of the two two-dimensional points of the group into a feature matching network to determine whether the image feature information of the two two-dimensional points matches;
Step 2: when it is determined that the image feature information of the two two-dimensional points matches, determining the two two-dimensional points whose image features match as the two two-dimensional points with matching image features in the group of target images.
Here, the feature matching network can be used to determine whether the image feature information of two two-dimensional points matches, where the input of the feature matching network is a group of two-dimensional points corresponding to a group of target images.
In this way, the matching of the image feature information of the two two-dimensional points in each group of two-dimensional points is realized with the feature matching network, and the operation is simple.
The feature matching network may be trained based on image samples from multiple viewing angles and annotation information for the same target object. That is, after the image feature information of the corresponding two-dimensional points has been extracted from the image samples of the multiple viewing angles, the extracted image feature information can be input into the feature matching network to be trained. When the network output is inconsistent with the annotation information, the network parameter values of the feature matching network are adjusted until the network output is consistent with the annotation information, so that the trained feature matching network is obtained.
The trained feature matching network can determine whether the image feature information of two two-dimensional points matches; a match between the image features of two two-dimensional points indicates that the two two-dimensional points correspond to the same target object.
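The disclosure does not fix the architecture of the feature matching network; the sketch below assumes a small PyTorch MLP head that takes the (possibly updated) image feature vectors of two 2D points and outputs a same-object probability, trained with binary cross-entropy against the same-object annotation described above. The class name and layer sizes are assumptions.
```python
import torch
import torch.nn as nn

class FeatureMatchNet(nn.Module):
    """Illustrative matching head: concatenated features of two 2D points
    -> probability that the two points belong to the same target object."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, feat_a, feat_b):
        logits = self.mlp(torch.cat([feat_a, feat_b], dim=-1))
        return torch.sigmoid(logits).squeeze(-1)

# Training sketch: binary cross-entropy against the same-object labels.
# loss = nn.functional.binary_cross_entropy(net(feat_a, feat_b), labels.float())
```
At inference time, a pair is accepted as belonging to the same target object when the output probability exceeds the preset threshold mentioned above.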
Considering that, in practical applications, the image feature information of other two-dimensional points in target images other than the one containing a given two-dimensional point influences that two-dimensional point, the embodiments of the present disclosure may update the image feature information of the two-dimensional point with the image feature information of those other two-dimensional points, and then input the updated image feature information into the feature matching network to determine whether the image feature information of two two-dimensional points matches.
In this way, the image feature information of a two-dimensional point can be updated based on the image feature information of other two-dimensional points in other target images, so that the updated image feature information is more accurate, further improving the matching accuracy.
In the embodiments of the present disclosure, the image feature information of a two-dimensional point can be updated in the following manner:
Step 1: determining the epipolar distance between the two-dimensional point and other two-dimensional points based on the two-dimensional coordinate information of the two-dimensional point in its target image and the two-dimensional coordinate information of the other two-dimensional points in target images other than the one containing the two-dimensional point;
Step 2: updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images, and the epipolar distance, to obtain updated image feature information.
In this way, the image feature information of a two-dimensional point can be updated based on its own image feature information, the image feature information of other two-dimensional points in other target images, and the epipolar distance, efficiently fusing multi-view features and significantly improving the matching accuracy.
Here, the epipolar distance between two two-dimensional points can be determined based on the respective two-dimensional coordinate information of the two points, and the image feature information of the two-dimensional point is then updated based on this epipolar distance and the image feature information of the two points. This takes into account that the epipolar distances associated with cameras at different viewing angles reflect the relationship between different target points: for two cameras (camera 1 and camera 2) and two target points (point A and point B), point A in the view of camera 1 corresponds to a line (the epipolar line) in the view of camera 2, and the distance between this epipolar line and point B in the view of camera 2 (the epipolar distance) determines how close the two points are. Updating the features of two-dimensional points with the epipolar distance enriches the updated image feature information, which is more conducive to the subsequent determination of the three-dimensional points.
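A minimal sketch of the epipolar distance described above, assuming the 3x3 fundamental matrix F mapping points in view 1 to epipolar lines in view 2 is known from the camera parameters; a symmetric variant that averages the distances in both directions can also be used.
```python
import numpy as np

def epipolar_distance(pt1, pt2, F):
    """Distance from `pt2` (in view 2) to the epipolar line of `pt1` (in view 1).

    F is the 3x3 fundamental matrix from view 1 to view 2, so that
    l2 = F @ [x1, y1, 1] is the epipolar line a*x + b*y + c = 0 in view 2.
    """
    l2 = F @ np.array([pt1[0], pt1[1], 1.0])
    a, b, c = l2
    return abs(a * pt2[0] + b * pt2[1] + c) / np.hypot(a, b)
```
This distance can then serve as an edge weight when fusing the other points' image features into the updated feature of a two-dimensional point, with closer (smaller-distance) points contributing more.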
When matching the image feature information of two two-dimensional points, the updated image feature information of the corresponding two-dimensional points can be selected directly for matching, without updating all two-dimensional points, which improves the overall detection efficiency.
Second aspect: the three-dimensional point detection method provided by the embodiments of the present disclosure may update features first and then determine the paired two-dimensional points; the paired two-dimensional points can be determined through the following steps:
Step 1: for a first target image among the multiple target images, updating the image feature information of the multiple two-dimensional points in the first target image based on the image feature information extracted from the first target image and the image feature information extracted from target images other than the first target image, to obtain updated image feature information corresponding to each of the multiple two-dimensional points in the first target image;
Step 2: determining paired two-dimensional points belonging to the same target object based on the updated image feature information corresponding to each of the multiple target images.
Here, the image feature information of the multiple two-dimensional points in the first target image can be updated based on the image feature information extracted from the target images other than the first target image. Two target images are then selected arbitrarily from the multiple target images, two corresponding two-dimensional points are selected from the two selected target images, and the updated image feature information corresponding to the two selected two-dimensional points is input into the pre-trained feature matching network to determine whether the two selected two-dimensional points are a pair of two-dimensional points belonging to the same target object.
In this way, after each two-dimensional point in each target image has been updated, the paired two-dimensional points belonging to the same target object can be determined based on the updated image feature information, improving the accuracy of two-dimensional point pairing.
For the process of updating the image feature information of each of the multiple two-dimensional points in the first target image, reference may be made to the description of the first aspect above.
In the embodiments of the present disclosure, when determining paired two-dimensional points, two target images may be selected arbitrarily from the multiple target images, and two corresponding two-dimensional points may be selected from the two selected target images, so as to verify the feature matching with the pre-trained feature matching network; for the verification process, see the description of the first aspect above.
In this way, the matching of two corresponding two-dimensional points in two target images can be realized based on the selection operation, and once the two two-dimensional points in the two target images are determined to match successfully, a target object can be locked onto; compared with realizing the matching by traversal, the amount of computation is significantly reduced.
It should be noted that, during the verification of feature matching, two target images may be selected arbitrarily and two corresponding two-dimensional points then selected; once the image features of these two two-dimensional points are successfully matched, a candidate three-dimensional point of the corresponding target object can be determined based on this pair of two-dimensional points, without verifying all pairing possibilities, which improves the overall detection efficiency.
In the embodiments of the present disclosure, when determining the target three-dimensional point of each target object based on the constructed candidate three-dimensional points, the corresponding candidate three-dimensional space can be determined first, and the three-dimensional coordinate information of the target three-dimensional point of each target object is then determined through projection operations from the three-dimensional space to the two-dimensional space. For each target object, the three-dimensional coordinate information of the target three-dimensional point can be determined through the following steps:
Step 1: spatially sampling the candidate three-dimensional space of the target object to determine multiple sampling points;
Step 2: for each of the multiple sampling points, determining the three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images;
Step 3: determining the three-dimensional coordinate information of the target three-dimensional point of each target object based on the obtained three-dimensional point detection results.
Here, adaptive sampling can be performed on the candidate three-dimensional space corresponding to each target object: first sampling equidistantly in the search space, and then determining the three-dimensional point detection result corresponding to each sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the multiple target images. In this way, further fine sampling can be performed around the reconstructed candidate three-dimensional points, so that a more precise position of the target three-dimensional point is obtained.
Moreover, for each target object, the corresponding candidate three-dimensional space can be determined, and the detection of the relevant three-dimensional point is realized based on sampling of that candidate three-dimensional space; compared with operating on the entire voxel space, the sampling operation on the candidate three-dimensional space significantly improves the detection efficiency.
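The adaptive, coarse-to-fine sampling described above could, for example, be realised as an equidistant grid inside the candidate space followed by a finer grid around the best coarse sample; the radii, step sizes and helper names below are assumptions for illustration.
```python
import numpy as np

def sample_ball(center, radius, step):
    """Equidistant grid samples inside a ball of `radius` around `center`."""
    ticks = np.arange(-radius, radius + 1e-9, step)
    offsets = np.stack(np.meshgrid(ticks, ticks, ticks, indexing="ij"),
                       axis=-1).reshape(-1, 3)
    offsets = offsets[np.linalg.norm(offsets, axis=1) <= radius]
    return offsets + np.asarray(center, dtype=float)

# Coarse pass over the candidate space, then a finer pass around the best
# coarse sample (which would be scored by the 3D point detection described below).
candidate_center = np.array([0.0, 0.0, 1.5])   # e.g. a reconstructed candidate 3D point
coarse_samples = sample_ball(candidate_center, radius=0.3, step=0.1)
# best = coarse_samples[np.argmax(score(coarse_samples))]
# fine_samples = sample_ball(best, radius=0.1, step=0.02)
```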
In the embodiments of the present disclosure, the three-dimensional point detection result corresponding to a sampling point can be determined through the following steps:
Step 1: for each of the multiple sampling points, projecting its three-dimensional coordinate information into the different viewing angles based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate system of each viewing angle, and determining the two-dimensional projection point information of the sampling point in each of the multiple target images;
Step 2: determining the sampling point feature information of the sampling point under the different viewing angles based on the two-dimensional projection point information of the sampling point in the multiple target images;
Step 3: determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under the different viewing angles.
The three-dimensional point detection method provided by the embodiments of the present disclosure can first determine the two-dimensional projection point information of a sampling point in the multiple target images and, based on this two-dimensional projection point information, determine the sampling point feature information of the sampling point under the different viewing angles.
The embodiments of the present disclosure use the sampling point feature information of a sampling point under different viewing angles to determine the connection relationships of the sampling point across viewing angles. Such connection relationships help determine more accurate sampling point feature information, which further improves the accuracy of the determined three-dimensional point detection results.
The two-dimensional projection point information may be determined based on the transformation relationship between the three-dimensional coordinate system of the sampling point and the two-dimensional coordinate system of the target image; that is, the sampling point can be projected onto the target image using this transformation relationship, thereby determining information such as the image position of the two-dimensional projection point of the sampling point on the target image.
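The projection of a sampling point into a view follows the standard pinhole model; a minimal sketch, assuming known per-camera intrinsics K and world-to-camera extrinsics [R | t].
```python
import numpy as np

def project_point(X, K, R, t):
    """Project a 3D sampling point X (world frame) into one view.

    K: 3x3 intrinsic matrix; R: 3x3 rotation; t: 3-vector translation of the
    world-to-camera transform. Returns (u, v) pixel coordinates in that view.
    """
    Xc = R @ np.asarray(X) + np.asarray(t)    # world frame -> camera frame
    uvw = K @ Xc                              # camera frame -> image plane
    return uvw[:2] / uvw[2]
```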
Based on the two-dimensional projection point information of a sampling point in the multiple target images, the sampling point feature information of the sampling point under the different viewing angles can be determined. The sampling point feature information determined here may fuse feature information from different viewing angles. This takes into account that, for the same target object, there is a certain connection relationship between the corresponding sampling points under different viewing angles, which can be used to update the sampling point features; in addition, there is also a certain connection relationship between corresponding sampling points under the same viewing angle, which can likewise be used to update the sampling point features, so that the determined sampling point feature information better fits the actual three-dimensional information of the target object.
Considering that the determination of the sampling point feature information plays a key role in the three-dimensional point detection, the process of determining the sampling point feature information is described in detail next.
The above process of determining the sampling point feature information includes the following steps:
Step 1: extracting the image features corresponding to each of the multiple target images;
Step 2: for each of the multiple target images, extracting, based on the image position information of the two-dimensional projection point of the sampling point in that target image, the image feature corresponding to the image position information from the image features corresponding to the target image;
Step 3: determining the extracted image features corresponding to the image position information as the sampling point feature information of the sampling point under the different viewing angles.
Here, the sampling point feature information matching the sampling point can be determined based on the correspondence between the two-dimensional projection point information of the sampling point in the multiple target images and the image features, and the operation is simple.
In the three-dimensional point detection method provided by the embodiments of the present disclosure, to extract the sampling point feature information matching a sampling point, the image feature corresponding to the image position information can be extracted from the image features of the target image based on the image position information of the two-dimensional projection point of the sampling point in that target image, and the extracted image feature is used as the sampling point feature information matching the sampling point.
The image features corresponding to a target image may be obtained by image processing, extracted by a trained feature extraction network, or determined by other methods capable of extracting information characterizing the target object, the scene in which the target object is located, and so on, which is not limited in the embodiments of the present disclosure.
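One way to read the image feature at the projected position (an assumption, since the disclosure leaves the extractor open) is bilinear sampling from the per-view feature map; the sketch below assumes PyTorch feature maps of shape (C, H, W) and projections already expressed in the feature map's pixel coordinates.
```python
import torch
import torch.nn.functional as F

def sample_feature(feature_map, uv):
    """Bilinearly sample a (C, H, W) feature map at pixel location uv = (u, v)."""
    _, h, w = feature_map.shape
    # grid_sample expects coordinates normalised to [-1, 1].
    grid = torch.tensor([[[[2 * uv[0] / (w - 1) - 1, 2 * uv[1] / (h - 1) - 1]]]],
                        dtype=feature_map.dtype)
    sampled = F.grid_sample(feature_map.unsqueeze(0), grid, align_corners=True)
    return sampled.view(-1)   # feature vector of length C
```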
To determine a more accurate target three-dimensional point of the target object, the sampling point feature information of a sampling point can first be updated, and the three-dimensional point detection result corresponding to the sampling point then determined based on the updated sampling point feature information. The three-dimensional point detection result corresponding to a sampling point can be determined through the following steps:
Step 1: determining the updated sampling point feature information of the sampling point under the different viewing angles based on the sampling point feature information of the sampling point under the different viewing angles and the sampling point feature information of other sampling points associated with the sampling point;
Step 2: determining the three-dimensional point detection result corresponding to the sampling point based on the updated sampling point feature information corresponding to the sampling point.
Here, the sampling point feature information of a sampling point can be updated using its sampling point feature information under the different viewing angles and the sampling point feature information of the other sampling points associated with it. The updated sampling point feature information incorporates, to a certain extent, the features of other sampling points within one view as well as the features of sampling points across different views, so that the sampling point features become more accurate and, in turn, the determined three-dimensional pose information is also more accurate.
The other sampling points associated with a sampling point may be sampling points that have a connection relationship with it; the connection relationship here corresponds to the connection relationship between sampling points within the same view, while for the sampling point feature information of a sampling point under different viewing angles, what can be determined is the connection relationship between the two-dimensional projection points of the same sampling point in different views. For a sampling point under a target viewing angle, the sampling point feature information can be updated through the following steps:
步骤一、基于采样点在不同视角下的采样点特征信息以及采样点在不同视角下的各个二维投影点之间的第一连接关系对采样点在不同视角下的采样点特征信息进行第一更新,得到第一更新后的采样点特征信息;以及,基于采样点在目标视角下的采样点特征信息以及与采样点同属于目标视角、且与采样点存在第二连接关系的其他采样点的采样点特征信息对采样点在目标视角下的采样点特征信息进行第二更新,得到第二更新后的采样点特征信息; Step 1. Based on the sampling point feature information of the sampling point under different viewing angles and the first connection relationship between the two-dimensional projection points of the sampling point under different viewing angles, the sampling point feature information of the sampling point under different viewing angles is first performed. update, to obtain the first updated sampling point feature information; and, based on the sampling point feature information of the sampling point under the target perspective and the sampling points belonging to the target perspective and other sampling points that have a second connection relationship with the sampling point The sampling point characteristic information performs a second update on the sampling point characteristic information of the sampling point under the target perspective to obtain the second updated sampling point characteristic information;
步骤二、基于第一更新后的采样点特征信息以及第二更新后的采样点特征信息,确定采样点在目标视角下的更新采样点特征信息。Step 2: Based on the first updated sampling point feature information and the second updated sampling point feature information, determine the updated sampling point feature information of the sampling point under the target perspective.
Here, the first connection relationship between the two-dimensional projection points of a sampling point under different viewing angles is predetermined. Based on the first connection relationship, the sampling point feature information under one viewing angle can be updated; that is, the first updated sampling point feature information fuses the features of the same sampling point in other views. In addition, the sampling point feature information of a sampling point can be updated based on the sampling point feature information of other sampling points that belong to the same target viewing angle and have a second connection relationship with the sampling point; this second connection relationship may also be predetermined. The second updated sampling point feature information determined in this way fuses the features of other sampling points in the same view.
Combining the first updated sampling point feature information with the second updated sampling point feature information makes the updated sampling point feature information of the sampling point under the target viewing angle more accurate. The update of the sampling point under other viewing angles can be performed with reference to the above description.
In practical applications, a Graph Neural Network (GNN) can be used to implement the above update of sampling point feature information. Here, before the feature update, a graph model can be constructed based on the first connection relationship, the second connection relationship and the sampling point feature information, and the sampling point feature information of each sampling point can be continuously updated by performing convolution operations on the graph model.
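By way of illustration only, the following is a minimal sketch of one graph-convolution step over such a graph model, written in PyTorch. The module name, the use of row-normalized adjacency matrices for the first and second connection relationships, and the layer sizes are assumptions made for this sketch, not the network actually used in the embodiments.

```python
import torch
import torch.nn as nn

class SamplePointGNN(nn.Module):
    """One round of message passing over cross-view and intra-view edges.

    Hypothetical sketch: node features are sampling-point features (one node per
    sampling point per view); `cross_adj` encodes the first connection relationship
    (same sampling point across views) and `intra_adj` the second connection
    relationship (associated sampling points within a view).
    """
    def __init__(self, dim):
        super().__init__()
        self.cross_fc = nn.Linear(dim, dim)   # first update (across views)
        self.intra_fc = nn.Linear(dim, dim)   # second update (within a view)
        self.fuse = nn.Linear(2 * dim, dim)   # combine both updates

    def forward(self, feats, cross_adj, intra_adj):
        # feats:     [N, dim] sampling-point features for all views
        # cross_adj: [N, N] row-normalized adjacency over cross-view edges
        # intra_adj: [N, N] row-normalized adjacency over intra-view edges
        first = torch.relu(self.cross_fc(cross_adj @ feats))
        second = torch.relu(self.intra_fc(intra_adj @ feats))
        return self.fuse(torch.cat([first, second], dim=-1))
```

In use, several such layers could be stacked and applied repeatedly to keep refining the sampling point features before they are passed to the three-dimensional point detection network.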
For the same target object, the updated sampling point feature information corresponding to each of the sampling points of that target object can be input into a three-dimensional point detection network, and the three-dimensional point detection result corresponding to the target object can be determined from the predicted probability, for each sampling point, that the sampling point is the target point. Here, the three-dimensional coordinate information of the sampling point with the highest predicted probability may be determined as the three-dimensional coordinate information of the target three-dimensional point corresponding to the target object.
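A minimal sketch of this selection step is shown below; the helper name and the use of a softmax over the per-sampling-point logits are assumptions, and any detection head that outputs one score per sampling point could be substituted.

```python
import torch

def select_target_3d_point(sample_coords, sample_feats, detection_head):
    """Pick the sampling point with the highest predicted probability.

    Hypothetical helper: `sample_coords` is [K, 3] (3D coordinates of the K
    sampling points of one target object), `sample_feats` is [K, C] updated
    sampling-point features, and `detection_head` is any network mapping
    features to one logit per sampling point.
    """
    logits = detection_head(sample_feats).squeeze(-1)   # [K]
    probs = torch.softmax(logits, dim=0)                # probability per sampling point
    best = torch.argmax(probs)                          # index of the most likely point
    return sample_coords[best], probs[best]
```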
To facilitate further understanding of the three-dimensional point detection method provided by the embodiments of the present disclosure, a further description is given below with reference to FIG. 2.
As shown by the dotted lines in the upper part of FIG. 2, a graph model G={V, E} is constructed using the two 2D center points in each of the three target images. The nodes V correspond to the image feature information of the 2D center points under each viewing angle, and the edges E correspond to the relationships between nodes, which may be the epipolar distances between the 2D center points.
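The epipolar distance used as an edge weight could, for example, be computed from the fundamental matrix between two views. The sketch below is an illustrative assumption (the symmetric point-to-epipolar-line distance); the embodiments do not prescribe this exact formula.

```python
import numpy as np

def epipolar_distance(pt_a, pt_b, F):
    """Symmetric distance between a 2D point pair under the fundamental matrix F.

    Hypothetical sketch: pt_a and pt_b are pixel coordinates (x, y) of 2D center
    points in two views, and F maps points in view A to epipolar lines in view B.
    """
    a = np.array([pt_a[0], pt_a[1], 1.0])
    b = np.array([pt_b[0], pt_b[1], 1.0])
    line_in_b = F @ a          # epipolar line of pt_a in view B: (l1, l2, l3)
    line_in_a = F.T @ b        # epipolar line of pt_b in view A
    d_b = abs(b @ line_in_b) / np.hypot(line_in_b[0], line_in_b[1])
    d_a = abs(a @ line_in_a) / np.hypot(line_in_a[0], line_in_a[1])
    return 0.5 * (d_a + d_b)
```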
After the graph model is constructed, the image feature information of the 2D center points under different viewing angles can be updated; here, the graph neural network 201 can be used to implement the feature update. For the updated image feature information, the feature matching network 202 can be used to determine whether each pair of 2D center points (that is, each edge) belongs to the same target object. After feature updating and feature matching, the pairing relationships shown by the solid lines in the lower part of FIG. 2 can be obtained.
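A minimal sketch of what such a feature matching network might look like is given below; the MLP structure, hidden size, and 0.5 decision threshold are assumptions for illustration rather than the actual architecture of network 202.

```python
import torch
import torch.nn as nn

class PairMatchingHead(nn.Module):
    """Score whether two 2D center points belong to the same target object.

    Hypothetical sketch of a feature matching network: it concatenates the
    updated image features of the two points and outputs a match probability.
    """
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feat_a, feat_b):
        score = self.mlp(torch.cat([feat_a, feat_b], dim=-1))
        return torch.sigmoid(score)  # > 0.5 -> treat the pair as the same object
```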
Using the pairing relationships shown in the lower part of FIG. 2, a candidate three-dimensional space can be determined for each target object, such as the spherical three-dimensional space pointed to by the dotted lines in the lower part of FIG. 2.
In the embodiments of the present disclosure, by spatially sampling this three-dimensional space and relying on the conversion relationship between the three-dimensional coordinate system and the two-dimensional coordinate systems, the three-dimensional center point detection results corresponding to the sampling points can be determined, and the three-dimensional coordinate information of the target three-dimensional center point of each target object can then be determined.
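The sketch below illustrates, under stated assumptions, how the spherical candidate space could be sampled and how each sampling point could be projected into every view using 3x4 camera projection matrices; the uniform-in-sphere sampling scheme and the sample count are illustrative choices only.

```python
import numpy as np

def sample_and_project(center, radius, proj_mats, num_samples=512, rng=None):
    """Sample a candidate spherical 3D space and project the samples into each view.

    Hypothetical sketch: `center` is the candidate 3D point (3,), `radius` the extent
    of the candidate space, and `proj_mats` a list of 3x4 camera projection matrices
    (one per viewing angle).
    """
    rng = rng or np.random.default_rng()
    # rejection-sample points uniformly inside the sphere
    pts = []
    while len(pts) < num_samples:
        p = rng.uniform(-radius, radius, size=3)
        if np.linalg.norm(p) <= radius:
            pts.append(center + p)
    pts = np.stack(pts)                                            # [K, 3]

    pts_h = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)  # homogeneous [K, 4]
    projections = []
    for P in proj_mats:
        uvw = pts_h @ P.T                                          # [K, 3]
        projections.append(uvw[:, :2] / uvw[:, 2:3])               # pixel coordinates per view
    return pts, projections
```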
Since the three-dimensional point detection method provided by the embodiments of the present disclosure can further search for the target three-dimensional point of each target object within the candidate three-dimensional space, the reconstruction error caused by an insufficiently accurate reconstructed candidate three-dimensional point can be reduced to a certain extent. In addition, even if a pairing error occurs before reconstruction, the target three-dimensional point of a target object can still be determined through the search operation in the candidate three-dimensional space in which each target object is located. For example, if target object A and target object B are paired incorrectly while target object B and target object C are paired correctly, the incorrectly paired search result can be verified against the correctly paired search result, further improving the detection accuracy for multiple target objects.
Based on the three-dimensional coordinate information of the target three-dimensional point of each target object determined above, the approximate position of each target object can be determined, and subsequent multi-person pose recognition can then be performed. In the embodiments of the present disclosure, the movement trajectory of each target object can also be determined by analyzing the three-dimensional coordinate information of the target three-dimensional points of multiple target objects over multiple consecutive frames of target images; other related applications can be implemented as well.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure further provide a three-dimensional point detection apparatus corresponding to the three-dimensional point detection method. Since the principle by which the apparatus in the embodiments of the present disclosure solves the problem is similar to that of the above three-dimensional point detection method of the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method.
Referring to FIG. 3, which is a schematic diagram of a three-dimensional point detection apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition part 301 and a detection part 302; wherein
the acquisition part 301 is configured to acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
the detection part 302 is configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, and determine the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
The embodiments of the present disclosure use the projection relationship between the candidate three-dimensional space in which the candidate three-dimensional point of a target object is located and the target images under multiple viewing angles, so that the three-dimensional point of each target object can be detected accurately. At the same time, the projection operation performed on points within the candidate three-dimensional space avoids voxelizing the entire space, which significantly improves detection efficiency.
In a possible implementation, the three-dimensional point includes a three-dimensional center point; the candidate three-dimensional point includes a candidate three-dimensional center point, and the candidate three-dimensional center point of a target object is located at the center of the target object; and the target three-dimensional point includes a target three-dimensional center point.
In a possible implementation, the detection part 302 is configured to determine the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images according to the following steps:
spatially sampling the candidate three-dimensional space of the target object to determine multiple sampling points;
for each of the multiple sampling points, determining a three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images;
determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the obtained three-dimensional point detection results.
In a possible implementation, the detection part 302 is configured to determine the three-dimensional point detection result corresponding to a sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images according to the following steps:
for each of the multiple sampling points, projecting the three-dimensional coordinate information to different viewing angles based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate systems of the respective viewing angles, and determining two-dimensional projection point information of the sampling point in each of the multiple target images;
determining sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the multiple target images;
determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles.
In a possible implementation, the two-dimensional projection point information includes image position information of the two-dimensional projection points; the detection part 302 is configured to determine the sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the multiple target images according to the following steps:
extracting image features respectively corresponding to the multiple target images;
for each of the multiple target images, extracting, from the image features corresponding to that target image, the image features corresponding to the image position information, based on the image position information of the two-dimensional projection points of the sampling point in the multiple target images;
determining, from the extracted image features corresponding to the image position information, the sampling point feature information of the sampling point under different viewing angles (a sketch of this per-view feature sampling follows below).
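As an illustration of the step above, the sketch below reads a feature vector for the sampling point from each view's feature map at the projected image position. Bilinear interpolation via grid_sample is an assumption; the embodiments only require that image features corresponding to the image position information be extracted.

```python
import torch
import torch.nn.functional as F

def sample_point_features(feature_maps, proj_points, image_sizes):
    """Read image features at the 2D projection locations of a sampling point.

    Hypothetical sketch: `feature_maps` is a list of per-view feature maps with
    shape [C, H, W], `proj_points` a list of (x, y) pixel coordinates of the
    sampling point in each view, and `image_sizes` the (width, height) of each
    target image.
    """
    per_view_feats = []
    for fmap, (x, y), (w, h) in zip(feature_maps, proj_points, image_sizes):
        # normalize pixel coordinates to [-1, 1] as expected by grid_sample
        grid = torch.tensor([[[[2.0 * x / (w - 1) - 1.0,
                                2.0 * y / (h - 1) - 1.0]]]], dtype=fmap.dtype)
        feat = F.grid_sample(fmap[None], grid, mode="bilinear",
                             align_corners=True)          # [1, C, 1, 1]
        per_view_feats.append(feat.reshape(-1))           # [C]
    return torch.stack(per_view_feats)                    # [num_views, C]
```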
In a possible implementation, the detection part 302 is configured to determine the three-dimensional point detection result corresponding to a sampling point based on the sampling point feature information of the sampling point under different viewing angles according to the following steps:
determining updated sampling point feature information of the sampling point under different viewing angles based on the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point;
determining the three-dimensional point detection result corresponding to the sampling point based on the updated sampling point feature information corresponding to the sampling point.
In a possible implementation, the acquisition part 301 is configured to determine the three-dimensional coordinate information of the candidate three-dimensional point of each target object according to the following steps:
extracting image feature information of multiple two-dimensional points from the multiple target images respectively, where each two-dimensional point is a pixel point located in the corresponding target object;
determining paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the multiple target images, where the paired two-dimensional points come from different target images;
determining the three-dimensional coordinate information of the candidate three-dimensional point of the same target object according to the two-dimensional coordinate information of the determined paired two-dimensional points in their respective target images (a triangulation sketch for this step is given below).
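One common way to recover a candidate three-dimensional point from paired two-dimensional points and known camera projection matrices is linear (DLT) triangulation, sketched below as an assumption; the embodiments are not limited to this particular solver.

```python
import numpy as np

def triangulate_candidate_point(points_2d, proj_mats):
    """Linear (DLT) triangulation of a candidate 3D point from paired 2D points.

    Hypothetical sketch: `points_2d` is a list of (x, y) pixel coordinates of the
    paired two-dimensional points of one target object, and `proj_mats` the
    corresponding 3x4 camera projection matrices.
    """
    rows = []
    for (x, y), P in zip(points_2d, proj_mats):
        rows.append(x * P[2] - P[0])   # each view contributes two linear constraints
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)                 # [2 * num_views, 4]
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]                # homogeneous -> Euclidean 3D coordinates
```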
In a possible implementation, the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the multiple target images according to the following steps:
combining the target images in pairs to obtain at least one group of target images;
determining, based on the image feature information of the two-dimensional points in the multiple target images, whether two two-dimensional points with matching image features exist in each group of target images, the two two-dimensional points respectively belonging to different target images in the same group of target images;
in a case where it is determined that two two-dimensional points with matching image features exist in a group of target images, determining the two two-dimensional points with matching image features as paired two-dimensional points belonging to the same target object.
In a possible implementation, the acquisition part 301 is configured to determine, based on the image feature information of the two-dimensional points in the multiple target images, whether two two-dimensional points with matching image features exist in each group of target images according to the following steps:
for each group of target images, combining the two-dimensional points in the two target images of that group in pairs to obtain multiple groups of two-dimensional points; and determining, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, whether two two-dimensional points with matching image features exist in that group of target images.
In a possible implementation, the acquisition part 301 is configured to determine, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, whether two two-dimensional points with matching image features exist in that group of target images according to the following steps:
for each group of two-dimensional points, inputting the image feature information of the two two-dimensional points of that group into a feature matching network, and determining whether the image feature information of the two two-dimensional points matches;
in a case where it is determined that the image feature information of the two two-dimensional points matches, determining the two two-dimensional points of any group whose image features match as the two two-dimensional points with matching image features in that group of target images.
In a possible implementation, the acquisition part 301 is configured to input the image feature information of the two two-dimensional points of a group into the feature matching network and determine whether the image feature information of the two two-dimensional points matches according to the following steps:
for each two-dimensional point in the group of two-dimensional points, updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point and the image feature information of other two-dimensional points in other target images different from the target image in which the two-dimensional point is located, to obtain updated image feature information;
inputting the updated image feature information respectively corresponding to the two two-dimensional points into the feature matching network, and determining whether the image feature information of the two two-dimensional points matches.
In a possible implementation, the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the multiple target images according to the following steps:
for a first target image among the multiple target images, updating the image feature information of the multiple two-dimensional points in the first target image based on the image feature information extracted from the first target image and the image feature information extracted from the target images other than the first target image among the multiple target images, to obtain updated image feature information respectively corresponding to the multiple two-dimensional points in the first target image;
determining the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the multiple target images.
In a possible implementation, the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the multiple target images according to the following steps:
arbitrarily selecting two target images from the multiple target images, and selecting two corresponding two-dimensional points respectively from the two selected target images;
inputting the updated image feature information respectively corresponding to the two selected two-dimensional points into a pre-trained feature matching network, and in a case where the network output indicates that the features match successfully, determining the two selected two-dimensional points as paired two-dimensional points belonging to the same target object (an end-to-end sketch of this pairing loop follows below).
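The sketch below ties these pieces together: it scores every cross-view pair of two-dimensional points with a matching network (such as the matching head sketched earlier) and keeps the pairs whose features match. The data layout and the 0.5 threshold are assumptions for illustration.

```python
from itertools import combinations

def pair_points_across_views(updated_feats, matching_net, threshold=0.5):
    """Group 2D points across views into pairs belonging to the same target object.

    Hypothetical sketch: `updated_feats` is a dict {view_id: [num_points, C] tensor}
    of updated image feature information, and `matching_net` takes two feature
    vectors and returns a match probability.
    """
    pairs = []
    for (va, fa), (vb, fb) in combinations(updated_feats.items(), 2):
        for i in range(fa.shape[0]):
            for j in range(fb.shape[0]):
                if matching_net(fa[i], fb[j]).item() > threshold:
                    pairs.append(((va, i), (vb, j)))   # (view, point index) pair
    return pairs
```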
In a possible implementation, the acquisition part 301 is configured to update the image feature information of a two-dimensional point according to the following steps:
determining the epipolar distance between the two-dimensional point and other two-dimensional points based on the two-dimensional coordinate information of the two-dimensional point in its corresponding target image and the two-dimensional coordinate information of the other two-dimensional points in other target images different from the target image in which the two-dimensional point is located;
updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images different from the target image in which the two-dimensional point is located, and the epipolar distance, to obtain updated image feature information (one way such an epipolar-weighted update could be realized is sketched below).
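Purely as an illustration of how the epipolar distance could modulate the update, the sketch below converts distances into attention weights with a Gaussian kernel and aggregates the other views' point features; the kernel, bandwidth, and residual fusion are assumptions, not the update rule of the embodiments.

```python
import torch

def update_point_feature(feat, other_feats, epipolar_dists, sigma=16.0):
    """Update a 2D point's feature from other views' features and epipolar distances.

    Hypothetical sketch: `feat` is the [C] feature of the point being updated,
    `other_feats` is [M, C] for the 2D points in the other target images, and
    `epipolar_dists` is [M] (in pixels).
    """
    weights = torch.softmax(-epipolar_dists ** 2 / (2.0 * sigma ** 2), dim=0)  # [M]
    aggregated = (weights[:, None] * other_feats).sum(dim=0)                   # [C]
    return feat + aggregated   # residual-style fusion of cross-view evidence
```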
For the description of the processing flows of the parts in the apparatus and the interaction flows between the parts, reference may be made to the relevant descriptions in the above method embodiments, and details are not repeated here.
An embodiment of the present disclosure further provides an electronic device. As shown in FIG. 4, which is a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure, the electronic device includes: a processor 401, a memory 402 and a bus 403. The memory 402 stores machine-readable instructions executable by the processor 401 (for example, execution instructions corresponding to the acquisition part 301 and the detection part 302 in the apparatus of FIG. 3). When the electronic device is running, the processor 401 and the memory 402 communicate through the bus 403, and when the machine-readable instructions are executed by the processor 401, the following processing is performed:
acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
for each target object, performing the following steps:
determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the three-dimensional point detection method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product. The computer program product includes a computer program or instructions that, when run on a computer, cause the computer to execute the steps of the three-dimensional point detection method described in the above method embodiments; reference may be made to the above method embodiments.
The above computer program product may be implemented by hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the working processes of the systems and apparatuses described above, reference may be made to the corresponding processes in the foregoing method embodiments. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. The apparatus embodiments described above are schematic; for example, the division of the units is a logical functional division, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or may be integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, apparatuses or units, and may be in electrical, mechanical or other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure, in essence, or the part contributing to the related art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
The aforementioned computer-readable storage medium may be a tangible device capable of retaining and storing instructions used by an instruction execution device, and may be a volatile storage medium or a non-volatile storage medium. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM or flash memory), a Static Random-Access Memory (SRAM), a portable Compact Disk Read Only Memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the above. As used herein, a computer-readable storage medium is not to be construed as a transient signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
Finally, it should be noted that the above embodiments are only specific implementations of the embodiments of the present disclosure, used to illustrate the technical solutions of the embodiments of the present disclosure rather than to limit them, and the protection scope of the embodiments of the present disclosure is not limited thereto. Although the embodiments of the present disclosure have been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the embodiments of the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered within the protection scope of the embodiments of the present disclosure. Therefore, the protection scope of the embodiments of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
The embodiments of the present disclosure acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images; and, for each target object, perform the following steps: determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images. In this way, using the projection relationship between the candidate three-dimensional space in which the candidate three-dimensional points of a target object are located and the target images under multiple viewing angles, the three-dimensional point of each target object can be detected accurately; at the same time, the projection operation on points within the candidate three-dimensional space avoids voxelizing the entire space, which significantly improves detection efficiency.

Claims (18)

1. A method for three-dimensional point detection, the method comprising:
acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
for each target object, determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and
determining three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
2. The method according to claim 1, wherein the three-dimensional point comprises a three-dimensional center point; the candidate three-dimensional point comprises a candidate three-dimensional center point, and the candidate three-dimensional center point of the target object is located at the center of the target object; and the target three-dimensional point comprises a target three-dimensional center point.
3. The method according to claim 1 or 2, wherein the determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images comprises:
spatially sampling the candidate three-dimensional space of the target object to determine a plurality of sampling points;
for each of the plurality of sampling points, determining a three-dimensional point detection result corresponding to the sampling point based on three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images; and
determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the obtained three-dimensional point detection results.
4. The method according to claim 3, wherein the determining the three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images comprises:
for each of the plurality of sampling points, projecting the three-dimensional coordinate information to different viewing angles based on a correspondence between a three-dimensional coordinate system of the candidate three-dimensional space and two-dimensional coordinate systems of the respective viewing angles, and determining two-dimensional projection point information of the sampling point in each of the plurality of target images;
determining sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the plurality of target images; and
determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles.
5. The method according to claim 4, wherein the two-dimensional projection point information comprises image position information of two-dimensional projection points, and the determining the sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the plurality of target images comprises:
extracting image features respectively corresponding to the plurality of target images;
for each of the plurality of target images, extracting, from the image features corresponding to the target image, image features corresponding to the image position information, based on the image position information of the two-dimensional projection points of the sampling point in the plurality of target images; and
determining, from the extracted image features corresponding to the image position information, the sampling point feature information of the sampling point under different viewing angles.
6. The method according to claim 4 or 5, wherein the determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles comprises:
determining updated sampling point feature information of the sampling point under different viewing angles based on the sampling point feature information of the sampling point under different viewing angles and sampling point feature information of other sampling points, the other sampling points being sampling points associated with the sampling point; and
determining the three-dimensional point detection result corresponding to the sampling point based on the updated sampling point feature information corresponding to the sampling point.
7. The method according to any one of claims 1 to 6, wherein determining the three-dimensional coordinate information of the candidate three-dimensional point of each target object comprises:
extracting image feature information of a plurality of two-dimensional points from the plurality of target images respectively, wherein each of the plurality of two-dimensional points is a pixel point located in a corresponding target object;
determining paired two-dimensional points belonging to a same target object based on the image feature information respectively extracted from the plurality of target images, wherein the paired two-dimensional points come from different target images; and
determining three-dimensional coordinate information of a candidate three-dimensional point of the same target object according to two-dimensional coordinate information of the determined paired two-dimensional points in their respective target images.
8. The method according to claim 7, wherein the determining the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the plurality of target images comprises:
combining the target images in pairs to obtain at least one group of target images;
determining, based on the image feature information of the two-dimensional points in the plurality of target images, whether two two-dimensional points with matching image features exist in each group of target images in the at least one group of target images, the two two-dimensional points respectively belonging to different target images in a same group of target images; and
in a case where it is determined that two two-dimensional points with matching image features exist in each group of target images, determining the two two-dimensional points with matching image features as the paired two-dimensional points belonging to the same target object.
9. The method according to claim 8, wherein the determining, based on the image feature information of the two-dimensional points in the plurality of target images, whether two two-dimensional points with matching image features exist in each group of target images in the at least one group of target images comprises:
for a target image group in the at least one group of target images, combining the two-dimensional points in the two target images of the target image group in pairs to obtain a plurality of groups of two-dimensional points; and determining, based on image feature information of two two-dimensional points included in a two-dimensional point group of the plurality of groups of two-dimensional points, whether two two-dimensional points with matching image features exist in the target image group.
10. The method according to claim 9, wherein the determining, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points of the plurality of groups of two-dimensional points, whether two two-dimensional points with matching image features exist in the target image group comprises:
for a two-dimensional point group of the plurality of groups of two-dimensional points, inputting the image feature information of the two two-dimensional points of the two-dimensional point group into a feature matching network, and determining whether the image feature information of the two two-dimensional points matches; and
in a case where it is determined that the image feature information of the two two-dimensional points matches, determining the two two-dimensional points with matching image features as the two two-dimensional points with matching image features existing in the target image group.
11. The method according to claim 10, wherein the inputting the image feature information of the two two-dimensional points of the two-dimensional point group into the feature matching network and determining whether the image feature information of the two two-dimensional points matches comprises:
for each two-dimensional point in the two-dimensional point group, updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point and image feature information of other two-dimensional points in other target images, to obtain updated image feature information, the other target images being target images different from the target image in which the two-dimensional point is located; and
inputting the updated image feature information respectively corresponding to the two two-dimensional points into the feature matching network, and determining whether the image feature information of the two two-dimensional points matches.
12. The method according to claim 7, wherein the determining the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the plurality of target images comprises:
for a first target image among the plurality of target images, updating image feature information of a plurality of two-dimensional points in the first target image based on the image feature information extracted from the first target image and image feature information extracted from other target images, to obtain updated image feature information respectively corresponding to the plurality of two-dimensional points in the first target image, the other target images being target images among the plurality of target images other than the first target image; and
determining the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the plurality of target images.
13. The method according to claim 12, wherein the determining the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the plurality of target images comprises:
arbitrarily selecting two target images from the plurality of target images, and selecting two corresponding two-dimensional points respectively from the two arbitrarily selected target images; and
inputting the updated image feature information respectively corresponding to the two selected two-dimensional points into a pre-trained feature matching network, and in a case where it is determined that a network output indicates that the features match successfully, determining the two selected two-dimensional points as the paired two-dimensional points belonging to the same target object.
14. The method according to any one of claims 11 to 13, wherein updating the image feature information of the two-dimensional point comprises:
determining an epipolar distance between the two-dimensional point and other two-dimensional points based on two-dimensional coordinate information of the two-dimensional point in the corresponding target image and two-dimensional coordinate information of the other two-dimensional points in other target images, the other target images being target images different from the target image in which the two-dimensional point is located; and
updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images different from the target image in which the two-dimensional point is located, and the epipolar distance, to obtain updated image feature information.
15. An apparatus for three-dimensional point detection, the apparatus comprising:
an acquisition part configured to acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images; and
a detection part configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, and determine three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
16. An electronic device, comprising: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, and when the electronic device is running, the processor and the memory communicate through the bus; and when the machine-readable instructions are executed by the processor, the steps of the method for three-dimensional point detection according to any one of claims 1 to 14 are executed.
17. A computer-readable storage medium, having a computer program stored thereon, wherein when the computer program is run by a processor, the steps of the method for three-dimensional point detection according to any one of claims 1 to 14 are executed.
18. A computer program product, comprising a computer program or instructions which, when run on a computer, cause the computer to execute the steps of the method for three-dimensional point detection according to any one of claims 1 to 14.
PCT/CN2022/088149 2021-08-13 2022-04-21 Three-dimensional point detection method and apparatus, electronic device, and storage medium WO2023015938A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110929512.6A CN113610967B (en) 2021-08-13 2021-08-13 Three-dimensional point detection method, three-dimensional point detection device, electronic equipment and storage medium
CN202110929512.6 2021-08-13

Publications (1)

Publication Number Publication Date
WO2023015938A1 (en)

Family

ID=78340615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/088149 WO2023015938A1 (en) 2021-08-13 2022-04-21 Three-dimensional point detection method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN113610967B (en)
WO (1) WO2023015938A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610967B (en) * 2021-08-13 2024-03-26 北京市商汤科技开发有限公司 Three-dimensional point detection method, three-dimensional point detection device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180218513A1 (en) * 2017-02-02 2018-08-02 Intel Corporation Method and system of automatic object dimension measurement by using image processing
CN109766882A (en) * 2018-12-18 2019-05-17 北京诺亦腾科技有限公司 Label identification method, the device of human body luminous point
CN112200851A (en) * 2020-12-09 2021-01-08 北京云测信息技术有限公司 Point cloud-based target detection method and device and electronic equipment thereof
CN112950668A (en) * 2021-02-26 2021-06-11 北斗景踪技术(山东)有限公司 Intelligent monitoring method and system based on mold position measurement
CN113610967A (en) * 2021-08-13 2021-11-05 北京市商汤科技开发有限公司 Three-dimensional point detection method and device, electronic equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815754B (en) * 2019-04-12 2023-05-30 Oppo广东移动通信有限公司 Three-dimensional information determining method, three-dimensional information determining device and terminal equipment
CN111951326A (en) * 2019-05-15 2020-11-17 北京地平线机器人技术研发有限公司 Target object skeleton key point positioning method and device based on multiple camera devices
WO2021046716A1 (en) * 2019-09-10 2021-03-18 深圳市大疆创新科技有限公司 Method, system and device for detecting target object and storage medium
CN112991440B (en) * 2019-12-12 2024-04-12 纳恩博(北京)科技有限公司 Positioning method and device for vehicle, storage medium and electronic device
WO2021184289A1 (en) * 2020-03-19 2021-09-23 深圳市大疆创新科技有限公司 Methods and device for solving an object and flying around point
CN111582207B (en) * 2020-05-13 2023-08-15 北京市商汤科技开发有限公司 Image processing method, device, electronic equipment and storage medium
CN112528831B (en) * 2020-12-07 2023-11-24 深圳市优必选科技股份有限公司 Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
CN112926395A (en) * 2021-01-27 2021-06-08 上海商汤临港智能科技有限公司 Target detection method and device, computer equipment and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TU HANYUE; WANG CHUNYU; ZENG WENJUN: "VoxelPose: Towards Multi-camera 3D Human Pose Estimation in Wild Environment", Computer Vision – ECCV 2020 (16th European Conference), pages 197-212, XP047593627, DOI: 10.1007/978-3-030-58452-8_12 *
WU SIZE; JIN SHENG; LIU WENTAO; BAI LEI; QIAN CHEN; LIU DONG; OUYANG WANLI: "Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images", 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 10 October 2021, pages 11128-11137, XP034092993, DOI: 10.1109/ICCV48922.2021.01096 *

Also Published As

Publication number Publication date
CN113610967B (en) 2024-03-26
CN113610967A (en) 2021-11-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22854942

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE