WO2023015938A1 - Three-dimensional point detection method and apparatus, electronic device, and storage medium - Google Patents

Three-dimensional point detection method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023015938A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
target
point
points
image
Prior art date
Application number
PCT/CN2022/088149
Other languages
French (fr)
Chinese (zh)
Inventor
吴思泽
金晟
刘文韬
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023015938A1 publication Critical patent/WO2023015938A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30244 Camera pose
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, and in particular to a three-dimensional point detection method, device, electronic device, and storage medium.
  • Three-dimensional (Three-Dimensional, 3D) human body pose estimation refers to estimating the pose of a human target from an image, video, or point cloud, and is often used in various industrial fields such as human body reconstruction, human-computer interaction, behavior recognition, and game modeling. In practical application scenarios, there is often a need for multi-person pose estimation. Human body center point detection can be used as a precursor task for multi-person pose estimation.
  • A human body center point detection scheme is provided in the related art, which performs multi-view feature extraction based on 3D space voxelization and detects the human body center point through a convolutional neural network (Convolutional Neural Networks, CNN).
  • In this scheme, spatial voxelization is to divide the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features can be used as the input of a 3D convolution.
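  • For reference, the following is a minimal sketch of the equidistant spatial voxelization used by the related-art scheme; the space extent and grid resolution are illustrative assumptions rather than values taken from the related art.

```python
import numpy as np

def voxelize_space(space_min, space_max, grid_size):
    """Divide an axis-aligned 3D space into equally sized voxels and return
    the (grid_size**3, 3) array of voxel-center coordinates. After multi-view
    image features are accumulated per voxel, such a grid is what a 3D CNN
    would take as input."""
    space_min = np.asarray(space_min, dtype=np.float32)
    space_max = np.asarray(space_max, dtype=np.float32)
    axes = [np.linspace(space_min[d], space_max[d], grid_size, endpoint=False)
            + (space_max[d] - space_min[d]) / (2 * grid_size)   # voxel centers
            for d in range(3)]
    xs, ys, zs = np.meshgrid(*axes, indexing="ij")
    return np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)

# Example: an 8 m x 8 m x 4 m capture space split into a 64 x 64 x 64 grid.
centers = voxelize_space([-4.0, -4.0, 0.0], [4.0, 4.0, 4.0], 64)
print(centers.shape)  # (262144, 3): the whole-space cost the disclosure avoids
```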
  • Embodiments of the present disclosure at least provide a three-dimensional point detection method, device, electronic device, and storage medium, which improve detection efficiency while improving detection accuracy.
  • an embodiment of the present disclosure provides a three-dimensional point detection method, the method being executed by an electronic device, and the method including:
  • for each target object, based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, determining the candidate three-dimensional space corresponding to the target object;
  • based on the candidate three-dimensional space corresponding to the target object and the target images, determining three-dimensional coordinate information of a target three-dimensional point of the target object.
  • In a case where the three-dimensional coordinate information of the candidate three-dimensional points of each target object is determined based on target images obtained by shooting multiple target objects from multiple viewing angles, the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the three-dimensional coordinate information of the candidate three-dimensional points of each target object and the target images.
  • In this way, the embodiments of the present disclosure can accurately detect the 3D point of each target object by utilizing the projection relationship between the candidate 3D space where the candidate 3D points of the target object are located and the target images from multiple viewing angles. At the same time, the projection operation is performed only for the candidate 3D space, which avoids voxelizing the entire space and thus significantly improves detection efficiency.
  • the embodiment of the present disclosure also provides a three-dimensional point detection device, the device includes:
  • the acquiring part is configured to acquire target images obtained by shooting multiple target objects from multiple viewing angles, and three-dimensional coordinate information of a candidate three-dimensional point of each of the multiple target objects determined based on the acquired target images;
  • the detection part is configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, and determine the three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
  • an embodiment of the present disclosure further provides an electronic device, including: a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the three-dimensional point detection method described in the first aspect or any one of its implementation modes are executed.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the three-dimensional point detection method described in the first aspect or any one of its implementation modes are executed.
  • the embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to execute the steps of the three-dimensional point detection method described in the first aspect.
  • FIG. 1 shows a flow chart of a method for three-dimensional point detection provided by an embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of the application of a method for three-dimensional point detection provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a three-dimensional point detection device provided by an embodiment of the present disclosure
  • Fig. 4 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • A human body center point detection scheme is provided in the related art, which extracts multi-view features based on 3D space voxelization and detects human body center points through a CNN.
  • In this scheme, spatial voxelization is to divide the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features can be used as the input of a 3D convolution.
  • the present disclosure provides a method, device, electronic device and storage medium for three-dimensional point detection, which improves detection efficiency while improving the accuracy of point detection.
  • the execution subject of the 3D point detection method provided in the embodiments of the present disclosure is generally an electronic device with certain computing power.
  • the electronic device includes, for example, a terminal device, a server, or another processing device; the terminal device may be user equipment (User Equipment, UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like.
  • the method for detecting a three-dimensional point may be implemented in a manner in which a processor invokes computer-readable instructions stored in a memory.
  • FIG. 1 is a flowchart of a method for three-dimensional point detection provided by an embodiment of the present disclosure
  • the method includes steps S101 to S104, wherein:
  • S101 Acquire target images obtained by shooting multiple target objects from multiple viewing angles, and 3D coordinate information of candidate 3D points of each of the multiple target objects determined based on the acquired target images;
  • S102 For each target object, based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, determine the candidate three-dimensional space corresponding to the target object; based on the candidate three-dimensional space corresponding to the target object and the target image, determine the target three-dimensional point of the target object 3D coordinate information.
  • the application scenario of the method may be briefly described next.
  • the method of 3D point detection in the embodiments of the present disclosure can be applied to the relevant application fields of multi-person 3D pose estimation.
  • For example, 3D pose estimation is performed on multiple pedestrians in front of a self-driving vehicle.
  • Another example is the field of intelligent security, in which the three-dimensional poses of multiple road vehicles and the like are estimated; this is not limited in the embodiments of the present disclosure.
  • the embodiments of the present disclosure provide a detection scheme that combines two-dimensional point matching under multiple viewing angles with candidate three-dimensional point reconstruction, which not only improves detection accuracy but also improves detection efficiency.
  • the target image acquired in the embodiment of the present disclosure may be obtained by shooting multiple target objects under multiple viewing angles, and one viewing angle may correspond to one target image.
  • the above-mentioned target images can be obtained by synchronously photographing multiple target objects with multiple cameras installed on the vehicle.
  • the multiple cameras here can be selected in combination with different user needs.
  • For example, three target images of the pedestrians ahead are captured by three cameras correspondingly installed at the two sides and the center position of the vehicle.
  • Each target image may correspond to multiple target objects, for example, it may be a captured target image including two pedestrians.
  • the three-dimensional coordinate information of the candidate three-dimensional point of each target object can be determined based on multiple target images captured under multiple viewing angles.
  • the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the candidate three-dimensional space corresponding to each target object and multiple target images under multiple viewing angles.
  • the candidate 3D point of the target object can be a candidate 3D center point located at the center of the target object, or other specific points that can characterize the target object; for example, specific points may be taken on a pedestrian's head, upper body, and lower body. The number of specific points can be set for different application scenarios, and is not limited here.
  • the candidate 3D center point is used as the candidate 3D point as an example in the following.
  • the 3D coordinate information of the candidate 3D points can be obtained by first pairing 2D points based on the target images and then performing reconstruction in 3D space based on the paired 2D points; in addition, it may also be determined in other ways, which is not limited here.
  • one or more candidate 3D points can be constructed for each target object.
  • each candidate 3D point among the multiple candidate 3D points of the target object can determine a spherical range with the 3D coordinate information of that candidate 3D point as the center of the sphere, and the candidate three-dimensional space corresponding to the target object can then be determined by performing a union operation on the spherical ranges determined by the target object's multiple candidate three-dimensional points, as sketched below.
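  • The following is a minimal sketch of representing the candidate three-dimensional space as the union of spherical ranges centred on the candidate points; the sphere radius is an illustrative assumption.

```python
import numpy as np

def in_candidate_space(query_xyz, candidate_points, radius=0.3):
    """Return a boolean per query point indicating whether it lies inside the
    union of spheres centred on the target object's candidate 3D points."""
    query_xyz = np.atleast_2d(query_xyz)        # (Q, 3)
    centers = np.atleast_2d(candidate_points)   # (C, 3)
    dists = np.linalg.norm(query_xyz[:, None, :] - centers[None, :, :], axis=-1)
    return (dists <= radius).any(axis=1)        # union over all spherical ranges
```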
  • the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the projection relationship between the spatial sampling of the corresponding candidate three-dimensional space and the target images under multiple viewing angles. Compared with voxelizing the entire space, it is only necessary to perform three-dimensional projection on the candidate three-dimensional space where the specified target object is located; in this way, more accurate three-dimensional coordinate information of the target three-dimensional point can be determined, and the amount of calculation is significantly reduced.
  • the three-dimensional coordinate information of the candidate three-dimensional point of the target object can be determined according to the following steps:
  • Step 1 Extract image feature information of a plurality of two-dimensional points from a plurality of target images, wherein each two-dimensional point in the plurality of two-dimensional points is a pixel located in a corresponding target object;
  • Step 2 Determine paired two-dimensional points belonging to the same target object based on image feature information respectively extracted from a plurality of target images, wherein the paired two-dimensional points come from different target images;
  • Step 3 Determine the 3D coordinate information of the candidate 3D points of the same target object according to the determined 2D coordinate information of the paired 2D points in the respective target images.
  • the image feature information of multiple two-dimensional points can be extracted from each target image based on an image feature extraction method, or the target image can be directly recognized by a two-dimensional point recognition network to determine the image feature information of each two-dimensional point. The image feature information of a two-dimensional point here may represent relevant features of the corresponding target object, for example, the position feature of a person's center point.
  • the determination of paired 2D points in the embodiments of the present disclosure can effectively associate the correspondence of target objects in 2D space, so that the constructed candidate 3D points can, to a certain extent, point to the same target object, thereby providing good data support for the accurate detection of multiple target objects.
  • the paired 2D points belonging to the same target object can be determined based on the image feature information respectively extracted from multiple target images, where the paired 2D points come from different target images.
  • On the one hand, the embodiments of the present disclosure may perform image pairing first and then determine the paired 2D points based on feature updates of the 2D points corresponding to the paired images; on the other hand, the features of all 2D points may be updated first and the paired two-dimensional points determined afterwards. This is not limited in the embodiments of the present disclosure.
  • Then, the 3D coordinate information of the candidate 3D points corresponding to the target object can be reconstructed.
  • here, a candidate 3D point can be reconstructed through triangulation; that is, in a multi-camera system, the 2D coordinates of the paired two-dimensional points and the camera parameters are used to reconstruct the 3D coordinates corresponding to the two-dimensional points, as sketched below.
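  • A minimal sketch of such a triangulation, using the direct linear transform over one matched 2D point per view, is given below; the embodiments do not prescribe this exact formulation, and the projection matrices stand in for the camera parameters mentioned above.

```python
import numpy as np

def triangulate(pt1, pt2, P1, P2):
    """Reconstruct a candidate 3D point from a pair of matched 2D points.

    pt1, pt2 : (2,) pixel coordinates of the paired 2D points in two views.
    P1, P2   : (3, 4) camera projection matrices (intrinsics @ extrinsics).
    """
    A = np.stack([
        pt1[0] * P1[2] - P1[0],
        pt1[1] * P1[2] - P1[1],
        pt2[0] * P2[2] - P2[0],
        pt2[1] * P2[2] - P2[1],
    ])
    # Homogeneous least squares: the right singular vector for the smallest
    # singular value of A gives the 3D point in homogeneous coordinates.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```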
  • the method for detecting three-dimensional points can perform image matching first, and then determine paired two-dimensional points.
  • the paired two-dimensional points can be determined through the following steps:
  • Step 1 Combine target images in pairs to obtain at least one group of target images;
  • Step 2 Based on the image feature information of the two-dimensional points in the multiple target images, determine whether there are two two-dimensional points with matching image features in each group of target images in the at least one group of target images, where the two two-dimensional points respectively belong to different target images in the same group of target images;
  • Step 3 In a case where it is determined that there are two 2D points with matching image features in a group of target images, determine the two 2D points with matching image features as a pair of 2D points belonging to the same target object.
  • multiple target images can be combined in pairs to obtain one or more sets of target images, and then it can be determined whether there are two two-dimensional points that match image features in each set of target images.
  • Here, matching can mean that the matching degree of the image feature information of the two two-dimensional points is greater than a preset threshold, so that two two-dimensional points whose image features match can be determined as a pair of two-dimensional points belonging to the same target object.
  • For each group of target images, the 2D points in the two target images of the group can be combined in pairs to obtain multiple groups of 2D points. The image feature information of the two two-dimensional points included in each group of two-dimensional points is then compared; that is, the following steps can be used to determine whether there are two two-dimensional points with matching image features in each group of target images.
  • Step 1 For each group of two-dimensional points, input the image feature information of the two two-dimensional points of the group into the feature matching network, and determine whether the image feature information of the two two-dimensional points matches;
  • Step 2 In a case where it is determined that the image feature information of the two 2D points matches, determine these two 2D points as the two 2D points with matching image features that exist in the group of target images.
  • a feature matching network can be used to determine whether the image feature information of two two-dimensional points matches.
  • the input of the feature matching network is a group of two-dimensional points corresponding to a group of target images.
  • the matching operation on the image feature information of the two two-dimensional points in each group of two-dimensional points is realized by means of the feature matching network, and the operation is simple.
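  • As an illustration only, a feature matching network of this kind could be a small MLP over the concatenated image feature information of the two two-dimensional points, as sketched below; the actual network architecture is not fixed by the embodiments.

```python
import torch
import torch.nn as nn

class FeatureMatchingNet(nn.Module):
    """Predicts whether two 2D-point features belong to the same target object."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, feat_a, feat_b):
        # Concatenate the two points' image feature information and score the pair.
        score = self.mlp(torch.cat([feat_a, feat_b], dim=-1))
        return torch.sigmoid(score)

# Usage: matched = FeatureMatchingNet()(feat_a, feat_b) > 0.5
```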
  • In the process of training the feature matching network, training can be performed based on image samples from multiple viewing angles and labeling information of the same target object; that is, image feature information corresponding to the two-dimensional points is extracted from the image samples from the multiple viewing angles, and the extracted image feature information can be input into the feature matching network to be trained.
  • the network parameter values of the feature matching network can be adjusted until the network output result is consistent with the labeling information, so as to train the feature matching network.
  • the trained feature matching network can be used to determine whether the image feature information of two two-dimensional points matches.
  • the image feature matching of two two-dimensional points indicates that the two two-dimensional points correspond to the same target object.
  • the embodiments of the present disclosure may update the image feature information of a two-dimensional point by combining the image feature information of the other 2D points mentioned above, and then input the updated image feature information into the feature matching network to determine whether the image feature information of the two two-dimensional points matches.
  • the image feature information of the two-dimensional point can be updated based on the image feature information of other two-dimensional points in other target images, so that the accuracy of the determined updated image feature information is higher, and the matching accuracy is further improved.
  • Step 1 Based on the two-dimensional coordinate information of the two-dimensional point in the corresponding target image and the two-dimensional coordinate information of other two-dimensional points in other target images different from the target image where the two-dimensional point is located, determine the epipolar distance between the two-dimensional point and the other two-dimensional points;
  • Step 2 Based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images different from the target image where the two-dimensional point is located, and the epipolar distance, update the image feature information of the two-dimensional point to obtain updated image feature information.
  • In this way, the image feature information of a two-dimensional point can be updated based on its own image feature information, the image feature information of other two-dimensional points in other target images different from the target image where the two-dimensional point is located, and the epipolar distance; multi-view features are thus efficiently integrated, and the matching accuracy can be significantly improved.
  • Here, the epipolar distance between two 2D points can be determined based on the respective 2D coordinate information of the 2D point and the other 2D point, and the update of the 2D point's image feature information is then realized based on the epipolar distance and the respective image feature information of the two 2D points.
  • the epipolar distances corresponding to the cameras under different viewing angles can reflect the relationship between different target points. Taking two cameras (camera 1 and camera 2) and two target points (point A and point B) as an example, point A in the view of camera 1 corresponds to a line (the epipolar line) in the view of camera 2, and the distance between that epipolar line and point B in the view of camera 2 determines the degree of proximity between the two points. Using the epipolar distance to update the feature of a two-dimensional point makes the content of the updated image feature information richer, which is more conducive to determining the subsequent three-dimensional point; a sketch of this computation follows.
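  • A minimal sketch of the epipolar distance and of an epipolar-weighted feature update is given below; the fundamental matrix F, the exponential weighting, and the mixing coefficients are illustrative assumptions rather than details fixed by the embodiments.

```python
import numpy as np

def epipolar_distance(pt1, pt2, F):
    """Distance from pt2 (in view 2) to the epipolar line of pt1 (from view 1).

    F is the 3x3 fundamental matrix mapping homogeneous view-1 points to
    epipolar lines in view 2.
    """
    x1 = np.array([pt1[0], pt1[1], 1.0])
    x2 = np.array([pt2[0], pt2[1], 1.0])
    line = F @ x1                              # a*x + b*y + c = 0 in view 2
    return abs(line @ x2) / np.hypot(line[0], line[1])

def update_feature(feat, other_feats, distances, temperature=10.0):
    """Fuse a 2D point's feature with features of 2D points from other views,
    weighting points that lie closer to the epipolar line more heavily."""
    w = np.exp(-np.asarray(distances, dtype=np.float64) / temperature)
    w = w / w.sum()
    fused = (w[:, None] * np.asarray(other_feats)).sum(axis=0)
    return 0.5 * np.asarray(feat) + 0.5 * fused
```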
  • the updated image feature information corresponding to the two-dimensional point can be directly selected for matching without updating all the two-dimensional points, which will improve the overall detection efficiency.
  • the method of 3D point detection provided by the embodiment of the present disclosure can perform feature update first, and then determine the paired 2D points, and the paired 2D points can be determined through the following steps:
  • Step 1 For the first target image among the multiple target images, based on the image feature information extracted from the first target image and the image feature information extracted from other target images except the first target image among the multiple target images , updating image feature information of multiple two-dimensional points in the first target image to obtain updated image feature information respectively corresponding to multiple two-dimensional points in the first target image;
  • Step 2 Determine pairs of two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the multiple target images.
  • Here, the image feature information of multiple two-dimensional points in the first target image can be updated based on the image feature information extracted from the target images other than the first target image among the multiple target images; two target images are then arbitrarily selected from the multiple target images, two corresponding two-dimensional points are selected from the two selected target images, and the updated image feature information corresponding to the two selected two-dimensional points is input into the pre-trained feature matching network to determine whether the two selected 2D points are paired 2D points belonging to the same target object.
  • two target images may be arbitrarily selected from the multiple target images, and two corresponding two-dimensional points may be respectively selected from the two selected target images; for the process of using the pre-trained feature matching network to verify whether the features of the two points match, reference may be made to the description of the first aspect above.
  • the matching operation corresponding to two 2D points in the two target images can be realized based on the selection operation, and once it is determined that the two 2D points in the two target images are successfully matched, a target object can be locked onto, so the amount of computation is significantly reduced.
  • That is, two target images can be selected arbitrarily, and then two corresponding two-dimensional points can be selected; once the image features of these two two-dimensional points are successfully matched, the candidate 3D point of the corresponding target object can be determined based on the paired 2D points without verifying all pairing situations, which improves the overall detection efficiency.
  • In the process of determining the target 3D point of each target object based on the constructed candidate 3D points, the corresponding candidate 3D space can be determined first, and the three-dimensional coordinate information of the target three-dimensional point of each target object can then be determined based on a projection operation from the 3D space to the 2D space.
  • the three-dimensional coordinate information of the target three-dimensional point can be determined through the following steps:
  • Step 1 Carry out spatial sampling of the candidate three-dimensional space of the target object, and determine a plurality of sampling points;
  • Step 2 for each sampling point in the plurality of sampling points, based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target image, determine the three-dimensional point detection result corresponding to the sampling point;
  • Step 3 Determine the three-dimensional coordinate information of the target three-dimensional point of each target object based on the obtained three-dimensional point detection result.
  • adaptive sampling can be carried out for the candidate 3D space corresponding to each target object.
  • For example, equidistant sampling is performed in this search space, and then, based on the 3D coordinate information of each sampling point in the candidate 3D space and the multiple target images, the 3D point detection result corresponding to each sampling point is determined. In this way, further fine sampling around the reconstructed candidate 3D point can be realized, so that a more accurate target 3D point position can be obtained; a sketch of the sampling follows.
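  • A minimal sketch of equidistant sampling inside a spherical candidate space is shown below; the radius and step size are illustrative assumptions and would in practice follow the size of the candidate three-dimensional space.

```python
import numpy as np

def sample_candidate_space(center, radius=0.3, step=0.05):
    """Equidistantly sample the candidate 3D space around a candidate 3D point,
    keeping only the sampling points that fall inside the spherical range."""
    offsets = np.arange(-radius, radius + 1e-9, step)
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)
    grid = grid[np.linalg.norm(grid, axis=1) <= radius]   # stay inside the sphere
    return grid + np.asarray(center)
```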
  • the corresponding candidate three-dimensional space can be determined, and the detection of relevant three-dimensional points can be realized based on the sampling of the candidate three-dimensional space.
  • the sampling operation for the candidate three-dimensional space significantly improves the detection efficiency.
  • the three-dimensional point detection results corresponding to the sampling points can be determined through the following steps:
  • Step 1 For each sampling point among the multiple sampling points, based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate system of each viewing angle, project the three-dimensional coordinate information of the sampling point to the different viewing angles, and determine the two-dimensional projection point information of the sampling point in the multiple target images respectively;
  • Step 2 Based on the two-dimensional projection point information of the sampling point in the multiple target images, determine the sampling point feature information of the sampling point under the different viewing angles;
  • Step 3 Determine the 3D point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under the different viewing angles.
  • the 3D point detection method provided by the embodiments of the present disclosure can first determine the two-dimensional projection point information of the sampling point in multiple target images, and then determine the sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information .
  • the connection relationship of sampling points under different viewing angles can be determined by using the sampling point feature information of the sampling points under the different viewing angles; such a connection relationship helps to determine more accurate sampling point feature information, which in turn improves the accuracy of the determined 3D point detection results.
  • the two-dimensional projection point information can be determined based on the conversion relationship between the three-dimensional coordinate system where the sampling point is located and the two-dimensional coordinate system where the target image is located; that is, the sampling point can be projected onto the target image by using this conversion relationship, thereby determining the image position and other information of the two-dimensional projection point of the sampling point on the target image, as sketched below.
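  • A minimal sketch of this projection under a pinhole camera model follows; the intrinsics K and extrinsics R, t stand in for the conversion relationship between the two coordinate systems.

```python
import numpy as np

def project_point(xyz, K, R, t):
    """Project a 3D sampling point into one camera view.

    K : (3, 3) camera intrinsics.
    R : (3, 3), t : (3,) world-to-camera rotation and translation.
    Returns the 2D pixel position of the two-dimensional projection point.
    """
    cam = R @ np.asarray(xyz, dtype=np.float64) + t   # world -> camera coordinates
    uvw = K @ cam                                     # camera -> image plane
    return uvw[:2] / uvw[2]                           # perspective division
```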
  • Based on the two-dimensional projection point information, the sampling point feature information of the sampling point under different viewing angles can be determined.
  • The sampling point feature information determined here can be feature information under different viewing angles. This is because, for the same target object, there is a certain connection relationship between the corresponding sampling points under different viewing angles, and this connection relationship can be used to update the features of the sampling points; in addition, under the same viewing angle there is also a certain connection relationship between the corresponding sampling points, which can likewise be used to update the features of the sampling points, so that the determined sampling point feature information is more in line with the actual 3D information of the target object.
  • Step 1 Extract image features respectively corresponding to the multiple target images;
  • Step 2 For each of the multiple target images, based on the image position information of the two-dimensional projection point of the sampling point in that target image, extract the image feature corresponding to the image position information from the image features corresponding to the target image;
  • Step 3 Use the extracted image features corresponding to the image position information to determine the sampling point feature information of the sampling point under the different viewing angles.
  • the characteristic information of the sampling point matching the sampling point can be determined based on the correspondence between the two-dimensional projection point information of the sampling point in multiple target images and the image features, and the operation is simple.
  • In order to extract the sampling point feature information matching a sampling point, the 3D point detection method provided by the embodiments of the present disclosure can, based on the image position information of the 2D projection points of the sampling point in the multiple target images, extract the image feature corresponding to the image position information from the image features of the corresponding target image, and use the extracted image feature as the sampling point feature information matched with the sampling point.
  • the image features corresponding to the target image can be obtained based on image processing, extracted based on a trained feature extraction network, or determined by other methods that can extract information representing the target object, the scene where the target object is located, and the like; this is not limited in the embodiments of the present disclosure.
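  • As one possible realization, the sampling point feature at a projected image position can be read out of a view's feature map by bilinear interpolation, as sketched below; the backbone producing the feature map is not restricted by the embodiments.

```python
import numpy as np

def sample_feature(feature_map, uv):
    """Bilinearly sample a (C, H, W) feature map at pixel position uv = (u, v)."""
    c, h, w = feature_map.shape
    u = float(np.clip(uv[0], 0, w - 1.001))
    v = float(np.clip(uv[1], 0, h - 1.001))
    u0, v0 = int(u), int(v)
    du, dv = u - u0, v - v0
    f = feature_map
    return ((1 - du) * (1 - dv) * f[:, v0, u0] +
            du * (1 - dv) * f[:, v0, u0 + 1] +
            (1 - du) * dv * f[:, v0 + 1, u0] +
            du * dv * f[:, v0 + 1, u0 + 1])
```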
  • the sampling point feature information of the sampling point can be updated first, and the 3D point detection result corresponding to the sampling point can then be determined based on the updated sampling point feature information; that is, the 3D point detection result corresponding to the sampling point can be determined through the following steps:
  • Step 1 Based on the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point, determine the updated sampling point feature information of the sampling point under different viewing angles;
  • Step 2 Based on the updated sampling point feature information corresponding to the sampling point, determine the three-dimensional point detection result corresponding to the sampling point.
  • the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point can be used to update the sampling point feature information of the sampling point. To a certain extent, the updated sampling point feature information includes the features of other sampling points within one view as well as the features of sampling points across different views, making the sampling point features more accurate and thus making the determined 3D pose information more accurate.
  • sampling points associated with the sampling point may be sampling points that have a connection relationship with the sampling point.
  • the connection relationship here corresponds, within the same view, to the connection relationship between sampling points; for sampling points under different viewing angles, what can be determined is the connection relationship between the two-dimensional projection points determined for the same sampling point under the different views.
  • Step 1 Based on the sampling point feature information of the sampling point under different viewing angles and the first connection relationship between the two-dimensional projection points of the sampling point under the different viewing angles, perform a first update on the sampling point feature information of the sampling point under the different viewing angles to obtain first updated sampling point feature information; and, based on the sampling point feature information of the sampling point under a target viewing angle and the sampling point feature information of other sampling points that belong to the target viewing angle and have a second connection relationship with the sampling point, perform a second update on the sampling point feature information of the sampling point under the target viewing angle to obtain second updated sampling point feature information;
  • Step 2 Based on the first updated sampling point feature information and the second updated sampling point feature information, determine the updated sampling point feature information of the sampling point under the target viewing angle.
  • the first connection relationship between the two-dimensional projection points of the sampling point under different viewing angles is predetermined; based on the first connection relationship, the feature information of the sampling point under one viewing angle can be updated, that is, the first updated sampling point feature information fuses the sampling point features of the same sampling point in other views.
  • In addition, the sampling point feature information of the sampling point can be updated based on the sampling point feature information of other sampling points that belong to the target viewing angle and have a second connection relationship with the sampling point; the second connection relationship can also be predetermined, so that the determined second updated sampling point feature information incorporates the sampling point features of other sampling points in the same view.
  • Combining the first updated sampling point feature information and the second updated sampling point feature information can make the updated sampling point feature information of the determined sampling point under the target view angle more accurate. For updates of sampling points in other perspectives, refer to the above description.
  • A graph neural network (Graph Neural Network, GNN) can be used to update the feature information of the above sampling points.
  • For example, a graph model can be constructed based on the first connection relationship, the second connection relationship, and the feature information of the sampling points, and the sampling point feature information can be continuously updated by performing convolution operations on the graph model, as sketched below.
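  • A minimal sketch of one round of such graph-based feature updating is given below, with cross-view edges standing for the first connection relationship and intra-view edges for the second; the mean-aggregation graph convolution and the feature dimensions are illustrative choices, not details fixed by the embodiments.

```python
import torch
import torch.nn as nn

class SamplingPointGNN(nn.Module):
    """One round of sampling-point feature updating over two edge types."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.cross_view = nn.Linear(2 * feat_dim, feat_dim)  # first connection relationship
        self.intra_view = nn.Linear(2 * feat_dim, feat_dim)  # second connection relationship
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, feats, adj_cross, adj_intra):
        # feats: (N, D) node features, one node per (sampling point, view) pair.
        # adj_cross / adj_intra: (N, N) float adjacency matrices of the two edge types.
        def aggregate(adj, proj):
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
            neigh = (adj @ feats) / deg                 # mean over neighbours
            return torch.relu(proj(torch.cat([feats, neigh], dim=-1)))

        first_update = aggregate(adj_cross, self.cross_view)    # fuse across views
        second_update = aggregate(adj_intra, self.intra_view)   # fuse within a view
        return torch.relu(self.fuse(torch.cat([first_update, second_update], dim=-1)))
```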
  • the updated sampling point feature information corresponding to all the sampling points of the target object can be input into the three-dimensional point detection network to obtain the 3D point detection results corresponding to the target object.
  • Among these results, the three-dimensional coordinate information of the sampling point with the highest prediction probability may be determined as the three-dimensional coordinate information of the target three-dimensional point corresponding to the target object.
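  • A minimal sketch of this final step is shown below; the scoring head is an illustrative stand-in for the three-dimensional point detection network, and the sampling point with the highest predicted probability supplies the target 3D point's coordinates.

```python
import torch
import torch.nn as nn

# Illustrative scoring head standing in for the 3D point detection network.
score_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

def detect_target_point(updated_feats, sample_xyz):
    """updated_feats: (N, 256) per-sampling-point features after graph updates.
    sample_xyz: (N, 3) 3D coordinate information of the sampling points."""
    probs = torch.sigmoid(score_head(updated_feats)).squeeze(-1)   # (N,) probabilities
    best = torch.argmax(probs)
    # The highest-probability sampling point gives the target 3D point.
    return sample_xyz[best], probs[best]
```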
  • As shown in FIG. 2, node V corresponds to the image feature information of the 2D center points under each viewing angle, and edge E corresponds to the relationship between the nodes, which can be the epipolar distance between the 2D center points.
  • the image feature information of the 2D center point under different viewing angles can be updated.
  • the graph neural network 201 can be used to update the features.
  • the feature matching network 202 can be used to determine whether each pair of 2D center points (that is, each edge) belongs to the same target object; after feature updating and feature matching, the pairing relationship shown by the solid lines in the lower part of FIG. 2 can be obtained.
  • a candidate three-dimensional space can be determined for each target object, such as the spherical three-dimensional space pointed by the dotted line in the lower part of FIG. 2 .
  • the detection results of the three-dimensional center points corresponding to the relevant sampling points can be determined, and the three-dimensional coordinate information of the target three-dimensional center point of each target object can then be determined.
  • Since the 3D point detection method provided by the embodiments of the present disclosure can further search for the target 3D point of each target object within the candidate 3D space, it can, to a certain extent, reduce the reconstruction error caused by inaccurately reconstructed candidate 3D points.
  • the target 3D point of each target object can be determined through a search operation over the candidate 3D space where the target object is located; for example, when there is a pairing error between target object A and target object B while the pairing of target object B and target object C is correct, the search result of the erroneous pairing can be verified against the search result of the correct pairing, further improving the detection accuracy for multiple target objects.
  • the approximate position of each target object can be determined, and subsequent multi-person pose estimation can be realized; other related applications can also be realized.
  • the embodiment of the present disclosure also provides a three-dimensional point detection device corresponding to the three-dimensional point detection method; since the problem-solving principle of the device in the embodiments of the present disclosure is similar to that of the above-mentioned three-dimensional point detection method of the embodiments of the present disclosure, the implementation of the device can refer to the implementation of the method.
  • FIG. 3 is a schematic diagram of a three-dimensional point detection device provided by an embodiment of the present disclosure.
  • the device includes: an acquisition part 301 and a detection part 302; wherein,
  • the acquiring part 301 is configured to acquire target images obtained by shooting multiple target objects under multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
  • the detection part 302 is configured to, for each target object, determine the candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; determine the target object based on the candidate three-dimensional space corresponding to the target object and the target image The three-dimensional coordinate information of the target three-dimensional point.
  • the embodiments of the present disclosure can accurately detect the 3D point of each target object by utilizing the projection relationship between the candidate 3D space where the candidate 3D points of the target object are located and the target images from multiple viewing angles. At the same time, the projection operation is performed only for the candidate 3D space, which avoids voxelizing the entire space and thus significantly improves detection efficiency.
  • the 3D point includes a 3D center point; the candidate 3D point includes a candidate 3D center point, and the candidate 3D center point of the target object is located at the center of the target object; the target 3D point includes the target 3D center point.
  • the detection part 302 is configured to determine the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images according to the following steps: performing spatial sampling on the candidate three-dimensional space of the target object to determine a plurality of sampling points; for each sampling point, determining the three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images; and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the obtained three-dimensional point detection results.
  • the detection part 302 is configured to determine the 3D point detection result corresponding to the sampling point based on the 3D coordinate information of the sampling point in the candidate 3D space and the target image according to the following steps:
  • based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate system of each viewing angle, the three-dimensional coordinate information of the sampling point is projected to the different viewing angles, and the two-dimensional projection point information of the sampling point in the multiple target images is determined;
  • based on the two-dimensional projection point information of the sampling point in the multiple target images, the sampling point feature information of the sampling point under the different viewing angles is determined;
  • based on the sampling point feature information of the sampling point under the different viewing angles, the 3D point detection result corresponding to the sampling point is determined.
  • the two-dimensional projection point information includes image position information of the two-dimensional projection point;
  • the detection part 302 is configured to determine the sampling point feature information of the sampling point under different viewing angles, based on the two-dimensional projection point information of the sampling point respectively in the multiple target images, according to the following steps:
  • the extracted image features corresponding to the image position information are used to determine the feature information of the sampling points under different viewing angles.
  • the detection part 302 is configured to determine the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles according to the following steps:
  • Based on the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point, determine the updated sampling point feature information of the sampling point under the different viewing angles;
  • a three-dimensional point detection result corresponding to the sampling point is determined.
  • the acquisition part 301 is configured to determine the three-dimensional coordinate information of the candidate three-dimensional points of each target object according to the following steps:
  • each two-dimensional point is a pixel located in a corresponding target object
  • the three-dimensional coordinate information of the candidate three-dimensional points of the same target object is determined according to the determined two-dimensional coordinate information of the paired two-dimensional points in the respective target images.
  • the acquiring part 301 is configured to determine pairs of two-dimensional points belonging to the same target object based on image feature information respectively extracted from multiple target images according to the following steps:
  • Based on the image feature information of the two-dimensional points in the multiple target images, determine whether there are two two-dimensional points with matching image features in each group of target images; the two two-dimensional points respectively belong to different target images in the same group of target images;
  • the two two-dimensional points with matching image features are determined as a pair of two-dimensional points belonging to the same target object.
  • the acquiring part 301 is configured to determine, based on the image feature information of the two-dimensional points in the multiple target images, whether there are two 2D points with matching image features in each group of target images, according to the following steps:
  • For each group of target images, combine the two-dimensional points in the two target images of the group in pairs to obtain multiple groups of two-dimensional points; based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, determine whether there are two 2D points with matching image features in the group of target images.
  • the acquisition part 301 is configured to determine, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, whether there are two two-dimensional points with matching image features in the group of target images, according to the following steps:
  • For each group of two-dimensional points, input the image feature information of the two two-dimensional points of the group into the feature matching network, and determine whether the image feature information of the two two-dimensional points matches;
  • any group of two two-dimensional points whose image features match is determined as the two two-dimensional points with matching image features that exist in the group of target images.
  • the acquisition part 301 is configured to input the image feature information of the two two-dimensional points of the group of two-dimensional points into the feature matching network and determine whether the image feature information of the two two-dimensional points matches, according to the following steps:
  • the image feature information of the two-dimensional points is updated to obtain updated image feature information, and the updated image feature information is input into the feature matching network to determine whether the image feature information of the two two-dimensional points matches.
  • the acquiring part 301 is configured to determine pairs of two-dimensional points belonging to the same target object based on image feature information respectively extracted from multiple target images according to the following steps:
  • the image feature information of the multiple two-dimensional points in the first target image is updated to obtain updated image feature information respectively corresponding to the multiple two-dimensional points in the first target image;
  • pairs of two-dimensional points belonging to the same target object are determined.
  • the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the image feature information respectively updated in multiple target images according to the following steps:
  • the acquisition part 301 is configured to update the image feature information of the two-dimensional point according to the following steps:
  • Based on the image feature information of the two-dimensional point, the image feature information of other two-dimensional points in other target images different from the target image where the two-dimensional point is located, and the epipolar distance, the image feature information of the two-dimensional point is updated to obtain updated image feature information.
  • FIG. 4 is a schematic structural diagram of the electronic device provided by the embodiment of the present disclosure, including: a processor 401 , a memory 402 , and a bus 403 .
  • the memory 402 stores machine-readable instructions executable by the processor 401 (for example, execution instructions corresponding to the acquisition part 301 and the detection part 302 in the device in FIG. 3); when the electronic device is running, the processor 401 communicates with the memory 402 through the bus 403, and when the machine-readable instructions are executed by the processor 401, the steps of the three-dimensional point detection method described in the foregoing method embodiments are performed.
  • An embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the method for three-dimensional point detection described in the above-mentioned method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to execute the steps of the three-dimensional point detection method described in the foregoing method embodiments, to which reference may be made.
  • the above-mentioned computer program product may be realized by hardware, software or a combination thereof.
  • In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) or the like.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the computer software product is stored in a storage medium and includes several instructions used to cause an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned computer-readable storage medium may be a tangible device capable of retaining and storing instructions used by an instruction execution device, and may be a volatile storage medium or a nonvolatile storage medium.
  • a computer readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above.
  • A non-exhaustive list of computer-readable storage media includes: a portable computer disk, a hard disk, a random access memory (Random Access Memory, RAM), a read-only memory (Read Only Memory, ROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM or flash memory), a static random-access memory (Static Random-Access Memory, SRAM), a portable compact disk read-only memory (Compact Disk Read Only Memory, CD-ROM), a digital versatile disc (Digital Versatile Disc, DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the foregoing.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • the embodiment of the present disclosure acquires target images obtained by shooting multiple target objects from multiple viewing angles, and 3D coordinate information of candidate 3D points of each of the multiple target objects determined based on the acquired target images; for each target object, the following steps are performed: determining the candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images. In this way, using the projection relationship between the candidate 3D space where the candidate 3D points of the target object are located and the target images from multiple viewing angles, the 3D point of each target object can be accurately detected; at the same time, the projection operation for the candidate 3D space avoids voxelizing the entire space, which significantly improves detection efficiency.

Abstract

The present invention provides a three-dimensional point detection method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a target image obtained by photographing a plurality of target objects at a plurality of viewing angles, and three-dimensional coordinate information of a candidate three-dimensional point of each target object in the plurality of target objects determined on the basis of the obtained target image; and for each target object, performing the following steps: determining a candidate three-dimensional space corresponding to the target object on the basis of the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and determining three-dimensional coordinate information of a target three-dimensional point of the target object on the basis of the candidate three-dimensional space corresponding to the target object and the target image.

Description

Method, apparatus, electronic device and storage medium for three-dimensional point detection
Cross-Reference to Related Applications
The present disclosure is based on the Chinese patent application No. 202110929512.6, filed on August 13, 2021 and entitled "Method, apparatus, electronic device and storage medium for three-dimensional point detection", and claims priority to that application, the entire contents of which are hereby incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the technical field of artificial intelligence, and relates to a three-dimensional point detection method, apparatus, electronic device and storage medium.
Background
Three-dimensional (3D) human body pose estimation refers to estimating the pose of a human target from an image, a video or a point cloud, and is widely used in industrial fields such as human body reconstruction, human-computer interaction, behavior recognition and game modeling. In practical application scenarios, multi-person pose estimation is often required, and human body center point detection can serve as a precursor task for multi-person pose estimation.
The related art provides a human body center point detection scheme that performs multi-view feature extraction based on 3D space voxelization and detects human body center points through a convolutional neural network (CNN). Spatial voxelization divides the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features serve as the input of a 3D convolution.
However, different targets cannot be effectively distinguished during voxelization, which leads to poor accuracy of the detected human body center points. At the same time, since the voxelization is performed over the entire space, it consumes a large amount of computation.
Summary
Embodiments of the present disclosure provide at least a three-dimensional point detection method, apparatus, electronic device and storage medium, which improve detection efficiency while improving detection accuracy.
In a first aspect, an embodiment of the present disclosure provides a three-dimensional point detection method, the method being executed by an electronic device and including:
acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
for each target object, determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object; and
determining three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
With the above three-dimensional point detection method, once the three-dimensional coordinate information of the candidate three-dimensional points of each target object has been determined based on target images obtained by photographing multiple target objects from multiple viewing angles, the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the three-dimensional coordinate information of the candidate three-dimensional points of that target object and the target images.
The embodiments of the present disclosure use the projection relationship between the candidate three-dimensional space in which the candidate three-dimensional points of a target object are located and the target images from the multiple viewing angles, so that the three-dimensional point of each target object can be detected accurately. At the same time, the projection operation performed on the candidate three-dimensional space avoids voxelizing the entire space, which significantly improves detection efficiency.
In a second aspect, an embodiment of the present disclosure further provides a three-dimensional point detection apparatus, the apparatus including:
an acquiring part, configured to acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
a detecting part, configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, and determine three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
In a third aspect, an embodiment of the present disclosure further provides an electronic device including a processor, a memory and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the three-dimensional point detection method described in the first aspect or any of its implementations are performed.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when run by a processor, performs the steps of the three-dimensional point detection method described in the first aspect or any of its implementations.
In a fifth aspect, an embodiment of the present disclosure further provides a computer program product including a computer program or instructions, where the computer program or instructions, when run on a computer, cause the computer to perform the steps of the three-dimensional point detection method described in the first aspect or any of its implementations.
For descriptions of the effects of the above three-dimensional point detection apparatus, electronic device and computer-readable storage medium, reference is made to the description of the above three-dimensional point detection method.
To make the above objects, features and advantages of the embodiments of the present disclosure more comprehensible, embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are briefly introduced below. The drawings are incorporated into and constitute a part of the specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the embodiments of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be derived from these drawings without creative effort.
Fig. 1 shows a flowchart of a three-dimensional point detection method provided by an embodiment of the present disclosure;
Fig. 2 shows a schematic diagram of an application of a three-dimensional point detection method provided by an embodiment of the present disclosure;
Fig. 3 shows a schematic diagram of a three-dimensional point detection apparatus provided by an embodiment of the present disclosure;
Fig. 4 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided in the drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the embodiments of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.
The term "and/or" herein describes an association relationship and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the term "at least one of" herein means any one of multiple items, or any combination of at least two of them; for example, including at least one of A, B and C may mean including any one or more elements selected from the set consisting of A, B and C.
Research has found that the related art provides a human body point detection scheme that performs multi-view feature extraction based on 3D space voxelization and detects human body points through a CNN. Spatial voxelization divides the 3D space equidistantly into grids of equal size, and the voxelized multi-view image features serve as the input of a 3D convolution.
However, different targets cannot be effectively distinguished during voxelization, which leads to poor accuracy of the detected points. At the same time, since the voxelization is performed over the entire space, it consumes a large amount of computation.
Based on the above research, the present disclosure provides a three-dimensional point detection method, apparatus, electronic device and storage medium, which improve detection efficiency while improving the accuracy of point detection.
To facilitate understanding of this embodiment, a three-dimensional point detection method disclosed in an embodiment of the present disclosure is first introduced in detail. The execution subject of the three-dimensional point detection method provided by the embodiments of the present disclosure is generally an electronic device with a certain computing capability, for example a terminal device, a server or another processing device. The terminal device may be a user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the three-dimensional point detection method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to Fig. 1, which is a flowchart of the three-dimensional point detection method provided by an embodiment of the present disclosure, the method includes steps S101 and S102:
S101: acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
S102: for each target object, determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional points of the target object, and determining three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
To facilitate understanding of the three-dimensional point detection method provided by the embodiments of the present disclosure, its application scenarios are briefly described first. The three-dimensional point detection method in the embodiments of the present disclosure can be applied to fields related to multi-person three-dimensional pose estimation, for example, three-dimensional pose estimation of multiple pedestrians in front of an autonomous vehicle in the field of autonomous driving, or estimation of the three-dimensional poses of multiple road vehicles in the field of intelligent security, which is not limited in the embodiments of the present disclosure. The following examples are mostly given in the field of autonomous driving.
In the related-art scheme that combines voxelization and a CNN to detect multi-target center points, different targets cannot be effectively distinguished during voxelization, which leads to poor accuracy of the detected points; at the same time, since the voxelization is performed over the entire space, it consumes a large amount of computation. In addition, even other schemes, such as reconstructing target points by combining epipolar matching and triangulation, suffer from poor detection accuracy due to the influence of epipolar matching.
It is precisely to solve the above problems that the embodiments of the present disclosure provide a detection scheme that combines two-dimensional point matching across multiple viewing angles with candidate three-dimensional point reconstruction, which improves detection efficiency while improving detection accuracy.
The target images acquired in the embodiments of the present disclosure may be captured for multiple target objects from multiple viewing angles, and one viewing angle may correspond to one target image. In the field of autonomous driving, the target images may be obtained by multiple cameras installed on a vehicle synchronously photographing multiple target objects; the multiple cameras may be selected according to different user requirements, for example, three cameras installed at the two sides and the center of the vehicle front that capture three target images of pedestrians ahead. Each target image may correspond to multiple target objects, for example, a captured target image including two pedestrians.
In the embodiments of the present disclosure, the three-dimensional coordinate information of the candidate three-dimensional points of each target object can be determined based on multiple target images captured from multiple viewing angles. Once the candidate three-dimensional space corresponding to each target object has been determined based on this three-dimensional coordinate information, the three-dimensional coordinate information of the target three-dimensional point of each target object can be determined based on the candidate three-dimensional space corresponding to that target object and the multiple target images from the multiple viewing angles.
Here, a candidate three-dimensional point of a target object may be a candidate three-dimensional center point located at the center of the target object, or another specific point capable of characterizing the target object; for example, a single specific point may be taken on a person's head, or one specific point may be taken on each of a pedestrian's head, upper body and lower body. The number of specific points can be set according to different application scenarios and is not limited here. For ease of description, the candidate three-dimensional center point is mostly used as the candidate three-dimensional point in the following examples.
The three-dimensional coordinate information of a candidate three-dimensional point may be obtained by first pairing two-dimensional points across the target images and then reconstructing the candidate three-dimensional point from the paired two-dimensional points. It may also be determined by other methods, which is not limited here.
Here, one or more candidate three-dimensional points can be constructed for each target object. Taking the case where multiple candidate three-dimensional points are constructed for one target object as an example, each candidate three-dimensional point determines a spherical range centered at the three-dimensional coordinates of that candidate three-dimensional point, and the union of the spherical ranges determined by the multiple candidate three-dimensional points of the target object yields the candidate three-dimensional space corresponding to that target object.
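Purely as an illustration of the union of spherical ranges just described, the following is a minimal sketch of a membership test for one object's candidate three-dimensional space; the radius value and function name are assumptions and not fixed by the disclosure.
```python
import numpy as np

def in_candidate_space(query_point, candidate_points, radius=0.3):
    """Check whether `query_point` lies inside one object's candidate 3D space,
    modelled as the union of balls of the given radius (an assumed value, in
    metres) centred at the object's candidate 3D points."""
    candidate_points = np.atleast_2d(candidate_points)              # shape (K, 3)
    dists = np.linalg.norm(candidate_points - np.asarray(query_point), axis=1)
    return bool((dists <= radius).any())
```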
The three-dimensional coordinate information of the target three-dimensional point of each target object may be determined based on the projection relationship between spatial samples of the corresponding candidate three-dimensional space and the target images from the multiple viewing angles. This avoids voxelizing the entire space: only the candidate three-dimensional space in which the specified target object is located needs to be projected, so more accurate three-dimensional coordinate information of the target three-dimensional point can be determined while the amount of computation drops significantly.
Considering that the determination of the three-dimensional coordinate information of the candidate three-dimensional points of a target object plays a key role in the subsequent three-dimensional point detection, the process of determining this three-dimensional coordinate information is described next.
In the embodiments of the present disclosure, the three-dimensional coordinate information of the candidate three-dimensional points of a target object can be determined according to the following steps:
Step 1: extracting image feature information of multiple two-dimensional points from the multiple target images, where each of the multiple two-dimensional points is a pixel located on the corresponding target object;
Step 2: determining paired two-dimensional points belonging to the same target object based on the image feature information extracted from the multiple target images, where the paired two-dimensional points come from different target images;
Step 3: determining the three-dimensional coordinate information of a candidate three-dimensional point of the same target object according to the two-dimensional coordinate information of the determined paired two-dimensional points in their respective target images.
Here, the image feature information of the multiple two-dimensional points can be extracted from each target image using an image feature extraction method, or the target image can be fed directly to a two-dimensional point recognition network to determine the image feature information of each two-dimensional point. The image feature information of a two-dimensional point represents features related to the corresponding target object, for example, the position feature of the center point of a person.
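The disclosure leaves the two-dimensional point extractor open; purely as an illustration, the sketch below assumes a per-view centre heatmap from which local maxima are taken as 2D points, with their feature vectors read from the per-view feature map. The function name, threshold and point count are assumptions.
```python
import torch
import torch.nn.functional as F

def decode_2d_points(heatmap, feature_map, k=10, thresh=0.3):
    """heatmap: (H, W) centre-point confidences; feature_map: (C, H, W).
    Returns up to k 2D points (x, y) and their image feature vectors."""
    pooled = F.max_pool2d(heatmap[None, None], 3, stride=1, padding=1)[0, 0]
    peaks = (heatmap == pooled) & (heatmap > thresh)       # simple non-maximum suppression
    ys, xs = torch.nonzero(peaks, as_tuple=True)
    order = torch.argsort(heatmap[ys, xs], descending=True)[:k]
    ys, xs = ys[order], xs[order]
    points = torch.stack([xs, ys], dim=1)                  # (x, y) pixel coordinates
    feats = feature_map[:, ys, xs].t()                     # one feature vector per point
    return points, feats
```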
In the embodiments of the present disclosure, the determination of paired two-dimensional points effectively associates the correspondence of target objects in the two-dimensional space, so that the constructed candidate three-dimensional points, to a certain extent, point to the same target object, which provides good data support for the accurate detection of multiple target objects.
In the embodiments of the present disclosure, paired two-dimensional points belonging to the same target object can be determined based on the image feature information extracted from the multiple target images; the paired two-dimensional points come from different target images.
In some embodiments, image pairing may be performed first and the paired two-dimensional points then determined based on the updated features of the two-dimensional points of the paired images; alternatively, the features of all two-dimensional points may be updated first and the paired two-dimensional points then determined. The embodiments of the present disclosure do not limit this.
Whichever pairing manner is used, since paired two-dimensional points belong to the same target object, a candidate three-dimensional point of the corresponding target object can be reconstructed based on the two-dimensional coordinate information of any pair of two-dimensional points in their respective target images. In some embodiments, for each pair of two-dimensional points judged to belong to the same target object, a candidate three-dimensional point can be reconstructed by triangulation; that is, in a multi-camera system, the 3D coordinates corresponding to the two-dimensional points are reconstructed from the 2D coordinates of the two-dimensional points in multiple views and the camera parameters.
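As a concrete illustration of the triangulation step just described, the following is a minimal sketch of linear (DLT) triangulation for one pair of matched 2D points, assuming the 3x4 projection matrices of the two cameras are known from the camera parameters; the function name is illustrative.
```python
import numpy as np

def triangulate_pair(p1, p2, P1, P2):
    """Reconstruct a candidate 3D point from a pair of matched 2D points.

    p1, p2: (x, y) pixel coordinates of the paired 2D points in the two views.
    P1, P2: 3x4 camera projection matrices (intrinsics @ [R | t]) of the views.
    Returns the 3D coordinates of the candidate point in the world frame.
    """
    # Each view contributes two linear constraints A @ X_h = 0 on the
    # homogeneous 3D point X_h (from x * (P[2] @ X) = P[0] @ X, and likewise for y).
    A = np.stack([
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
        p2[0] * P2[2] - P2[0],
        p2[1] * P2[2] - P2[1],
    ])
    # Least-squares solution of the homogeneous system: the right singular
    # vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]
```
Since a candidate 3D point is reconstructed for every pair of 2D points judged to belong to the same target object, one object may accumulate several candidate points, which is consistent with the union-of-spheres construction above.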
Considering that the determination of paired two-dimensional points plays a key role in reconstructing the target three-dimensional point, it is explained next from the following two aspects.
First aspect: the three-dimensional point detection method provided by the embodiments of the present disclosure may perform image matching first and then determine the paired two-dimensional points; the paired two-dimensional points can be determined through the following steps:
Step 1: combining the target images in pairs to obtain at least one group of target images;
Step 2: determining, based on the image feature information of the two-dimensional points in the multiple target images, whether each group of target images in the at least one group contains two two-dimensional points whose image features match, where the two two-dimensional points belong to different target images in the same group;
Step 3: when it is determined that a group of target images contains two two-dimensional points whose image features match, determining the two two-dimensional points whose image features match as a pair of two-dimensional points belonging to the same target object.
Here, the multiple target images can be combined in pairs to obtain one or more groups of target images, and it can then be determined whether each group of target images contains two two-dimensional points whose image features match. Matching here may mean that the matching degree of the image feature information of the two two-dimensional points is greater than a preset threshold; in this way, the two two-dimensional points whose image features match can be determined as a pair of two-dimensional points belonging to the same target object.
In this way, the determination of paired two-dimensional points belonging to the same target object is realized based on image grouping and image feature matching, which greatly increases the likelihood that the determined paired two-dimensional points correspond to the same target object, thereby improving the accuracy of the subsequent three-dimensional point detection.
Considering that there are multiple two-dimensional points in each target image, the two-dimensional points in the two target images of each group can first be combined in pairs to obtain multiple groups of two-dimensional points, and the image feature information of the two two-dimensional points in each group of two-dimensional points can then be compared. That is, whether a group of target images contains two two-dimensional points whose image features match can be determined through the following steps:
Step 1: for each group of two-dimensional points, inputting the image feature information of the two two-dimensional points of the group into a feature matching network to determine whether the image feature information of the two two-dimensional points matches;
Step 2: when it is determined that the image feature information of the two two-dimensional points matches, determining the two two-dimensional points whose image features match as the two two-dimensional points with matching image features in the group of target images.
Here, the feature matching network can be used to determine whether the image feature information of two two-dimensional points matches, where the input of the feature matching network is a group of two-dimensional points corresponding to a group of target images.
In this way, the matching of the image feature information of the two two-dimensional points in each group of two-dimensional points is realized with the feature matching network, and the operation is simple.
The feature matching network may be trained based on image samples from multiple viewing angles and annotation information for the same target object. That is, after the image feature information of the corresponding two-dimensional points has been extracted from the image samples of the multiple viewing angles, the extracted image feature information can be input into the feature matching network to be trained. When the network output is inconsistent with the annotation information, the network parameter values of the feature matching network are adjusted until the network output is consistent with the annotation information, so that the trained feature matching network is obtained.
The trained feature matching network can determine whether the image feature information of two two-dimensional points matches; a match between the image features of two two-dimensional points indicates that the two two-dimensional points correspond to the same target object.
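The disclosure does not fix the architecture of the feature matching network; the sketch below assumes a small PyTorch MLP head that takes the (possibly updated) image feature vectors of two 2D points and outputs a same-object probability, trained with binary cross-entropy against the same-object annotation described above. The class name and layer sizes are assumptions.
```python
import torch
import torch.nn as nn

class FeatureMatchNet(nn.Module):
    """Illustrative matching head: concatenated features of two 2D points
    -> probability that the two points belong to the same target object."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, feat_a, feat_b):
        logits = self.mlp(torch.cat([feat_a, feat_b], dim=-1))
        return torch.sigmoid(logits).squeeze(-1)

# Training sketch: binary cross-entropy against the same-object labels.
# loss = nn.functional.binary_cross_entropy(net(feat_a, feat_b), labels.float())
```
At inference time, a pair is accepted as belonging to the same target object when the output probability exceeds the preset threshold mentioned above.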
Considering that, in practical applications, the image feature information of other two-dimensional points in target images other than the one containing a given two-dimensional point influences that two-dimensional point, the embodiments of the present disclosure may update the image feature information of the two-dimensional point with the image feature information of those other two-dimensional points, and then input the updated image feature information into the feature matching network to determine whether the image feature information of two two-dimensional points matches.
In this way, the image feature information of a two-dimensional point can be updated based on the image feature information of other two-dimensional points in other target images, so that the updated image feature information is more accurate, further improving the matching accuracy.
In the embodiments of the present disclosure, the image feature information of a two-dimensional point can be updated in the following manner:
Step 1: determining the epipolar distance between the two-dimensional point and other two-dimensional points based on the two-dimensional coordinate information of the two-dimensional point in its target image and the two-dimensional coordinate information of the other two-dimensional points in target images other than the one containing the two-dimensional point;
Step 2: updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images, and the epipolar distance, to obtain updated image feature information.
In this way, the image feature information of a two-dimensional point can be updated based on its own image feature information, the image feature information of other two-dimensional points in other target images, and the epipolar distance, efficiently fusing multi-view features and significantly improving the matching accuracy.
Here, the epipolar distance between two two-dimensional points can be determined based on the respective two-dimensional coordinate information of the two points, and the image feature information of the two-dimensional point is then updated based on this epipolar distance and the image feature information of the two points. This takes into account that the epipolar distances associated with cameras at different viewing angles reflect the relationship between different target points: for two cameras (camera 1 and camera 2) and two target points (point A and point B), point A in the view of camera 1 corresponds to a line (the epipolar line) in the view of camera 2, and the distance between this epipolar line and point B in the view of camera 2 (the epipolar distance) determines how close the two points are. Updating the features of two-dimensional points with the epipolar distance enriches the updated image feature information, which is more conducive to the subsequent determination of the three-dimensional points.
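A minimal sketch of the epipolar distance described above, assuming the 3x3 fundamental matrix F mapping points in view 1 to epipolar lines in view 2 is known from the camera parameters; a symmetric variant that averages the distances in both directions can also be used.
```python
import numpy as np

def epipolar_distance(pt1, pt2, F):
    """Distance from `pt2` (in view 2) to the epipolar line of `pt1` (in view 1).

    F is the 3x3 fundamental matrix from view 1 to view 2, so that
    l2 = F @ [x1, y1, 1] is the epipolar line a*x + b*y + c = 0 in view 2.
    """
    l2 = F @ np.array([pt1[0], pt1[1], 1.0])
    a, b, c = l2
    return abs(a * pt2[0] + b * pt2[1] + c) / np.hypot(a, b)
```
This distance can then serve as an edge weight when fusing the other points' image features into the updated feature of a two-dimensional point, with closer (smaller-distance) points contributing more.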
When matching the image feature information of two two-dimensional points, the updated image feature information of the corresponding two-dimensional points can be selected directly for matching, without updating all two-dimensional points, which improves the overall detection efficiency.
Second aspect: the three-dimensional point detection method provided by the embodiments of the present disclosure may update features first and then determine the paired two-dimensional points; the paired two-dimensional points can be determined through the following steps:
Step 1: for a first target image among the multiple target images, updating the image feature information of the multiple two-dimensional points in the first target image based on the image feature information extracted from the first target image and the image feature information extracted from target images other than the first target image, to obtain updated image feature information corresponding to each of the multiple two-dimensional points in the first target image;
Step 2: determining paired two-dimensional points belonging to the same target object based on the updated image feature information corresponding to each of the multiple target images.
Here, the image feature information of the multiple two-dimensional points in the first target image can be updated based on the image feature information extracted from the target images other than the first target image. Two target images are then selected arbitrarily from the multiple target images, two corresponding two-dimensional points are selected from the two selected target images, and the updated image feature information corresponding to the two selected two-dimensional points is input into the pre-trained feature matching network to determine whether the two selected two-dimensional points are a pair of two-dimensional points belonging to the same target object.
In this way, after each two-dimensional point in each target image has been updated, the paired two-dimensional points belonging to the same target object can be determined based on the updated image feature information, improving the accuracy of two-dimensional point pairing.
For the process of updating the image feature information of each of the multiple two-dimensional points in the first target image, reference may be made to the description of the first aspect above.
In the embodiments of the present disclosure, when determining paired two-dimensional points, two target images may be selected arbitrarily from the multiple target images, and two corresponding two-dimensional points may be selected from the two selected target images, so as to verify the feature matching with the pre-trained feature matching network; for the verification process, see the description of the first aspect above.
In this way, the matching of two corresponding two-dimensional points in two target images can be realized based on the selection operation, and once the two two-dimensional points in the two target images are determined to match successfully, a target object can be locked onto; compared with realizing the matching by traversal, the amount of computation is significantly reduced.
It should be noted that, during the verification of feature matching, two target images may be selected arbitrarily and two corresponding two-dimensional points then selected; once the image features of these two two-dimensional points are successfully matched, a candidate three-dimensional point of the corresponding target object can be determined based on this pair of two-dimensional points, without verifying all pairing possibilities, which improves the overall detection efficiency.
In the embodiments of the present disclosure, when determining the target three-dimensional point of each target object based on the constructed candidate three-dimensional points, the corresponding candidate three-dimensional space can be determined first, and the three-dimensional coordinate information of the target three-dimensional point of each target object is then determined through projection operations from the three-dimensional space to the two-dimensional space. For each target object, the three-dimensional coordinate information of the target three-dimensional point can be determined through the following steps:
Step 1: spatially sampling the candidate three-dimensional space of the target object to determine multiple sampling points;
Step 2: for each of the multiple sampling points, determining the three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images;
Step 3: determining the three-dimensional coordinate information of the target three-dimensional point of each target object based on the obtained three-dimensional point detection results.
Here, adaptive sampling can be performed on the candidate three-dimensional space corresponding to each target object: first sampling equidistantly in the search space, and then determining the three-dimensional point detection result corresponding to each sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the multiple target images. In this way, further fine sampling can be performed around the reconstructed candidate three-dimensional points, so that a more precise position of the target three-dimensional point is obtained.
Moreover, for each target object, the corresponding candidate three-dimensional space can be determined, and the detection of the relevant three-dimensional point is realized based on sampling of that candidate three-dimensional space; compared with operating on the entire voxel space, the sampling operation on the candidate three-dimensional space significantly improves the detection efficiency.
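The adaptive, coarse-to-fine sampling described above could, for example, be realised as an equidistant grid inside the candidate space followed by a finer grid around the best coarse sample; the radii, step sizes and helper names below are assumptions for illustration.
```python
import numpy as np

def sample_ball(center, radius, step):
    """Equidistant grid samples inside a ball of `radius` around `center`."""
    ticks = np.arange(-radius, radius + 1e-9, step)
    offsets = np.stack(np.meshgrid(ticks, ticks, ticks, indexing="ij"),
                       axis=-1).reshape(-1, 3)
    offsets = offsets[np.linalg.norm(offsets, axis=1) <= radius]
    return offsets + np.asarray(center, dtype=float)

# Coarse pass over the candidate space, then a finer pass around the best
# coarse sample (which would be scored by the 3D point detection described below).
candidate_center = np.array([0.0, 0.0, 1.5])   # e.g. a reconstructed candidate 3D point
coarse_samples = sample_ball(candidate_center, radius=0.3, step=0.1)
# best = coarse_samples[np.argmax(score(coarse_samples))]
# fine_samples = sample_ball(best, radius=0.1, step=0.02)
```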
In the embodiments of the present disclosure, the three-dimensional point detection result corresponding to a sampling point can be determined through the following steps:
Step 1: for each of the multiple sampling points, projecting its three-dimensional coordinate information into the different viewing angles based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate system of each viewing angle, and determining the two-dimensional projection point information of the sampling point in each of the multiple target images;
Step 2: determining the sampling point feature information of the sampling point under the different viewing angles based on the two-dimensional projection point information of the sampling point in the multiple target images;
Step 3: determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under the different viewing angles.
The three-dimensional point detection method provided by the embodiments of the present disclosure can first determine the two-dimensional projection point information of a sampling point in the multiple target images and, based on this two-dimensional projection point information, determine the sampling point feature information of the sampling point under the different viewing angles.
The embodiments of the present disclosure use the sampling point feature information of a sampling point under different viewing angles to determine the connection relationships of the sampling point across viewing angles. Such connection relationships help determine more accurate sampling point feature information, which further improves the accuracy of the determined three-dimensional point detection results.
The two-dimensional projection point information may be determined based on the transformation relationship between the three-dimensional coordinate system of the sampling point and the two-dimensional coordinate system of the target image; that is, the sampling point can be projected onto the target image using this transformation relationship, thereby determining information such as the image position of the two-dimensional projection point of the sampling point on the target image.
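The projection of a sampling point into a view follows the standard pinhole model; a minimal sketch, assuming known per-camera intrinsics K and world-to-camera extrinsics [R | t].
```python
import numpy as np

def project_point(X, K, R, t):
    """Project a 3D sampling point X (world frame) into one view.

    K: 3x3 intrinsic matrix; R: 3x3 rotation; t: 3-vector translation of the
    world-to-camera transform. Returns (u, v) pixel coordinates in that view.
    """
    Xc = R @ np.asarray(X) + np.asarray(t)    # world frame -> camera frame
    uvw = K @ Xc                              # camera frame -> image plane
    return uvw[:2] / uvw[2]
```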
Based on the two-dimensional projection point information of a sampling point in the multiple target images, the sampling point feature information of the sampling point under the different viewing angles can be determined. The sampling point feature information determined here may fuse feature information from different viewing angles. This takes into account that, for the same target object, there is a certain connection relationship between the corresponding sampling points under different viewing angles, which can be used to update the sampling point features; in addition, there is also a certain connection relationship between corresponding sampling points under the same viewing angle, which can likewise be used to update the sampling point features, so that the determined sampling point feature information better fits the actual three-dimensional information of the target object.
Considering that the determination of the sampling point feature information plays a key role in the three-dimensional point detection, the process of determining the sampling point feature information is described in detail next.
The above process of determining the sampling point feature information includes the following steps:
Step 1: extracting the image features corresponding to each of the multiple target images;
Step 2: for each of the multiple target images, extracting, based on the image position information of the two-dimensional projection point of the sampling point in that target image, the image feature corresponding to the image position information from the image features corresponding to the target image;
Step 3: determining the extracted image features corresponding to the image position information as the sampling point feature information of the sampling point under the different viewing angles.
Here, the sampling point feature information matching the sampling point can be determined based on the correspondence between the two-dimensional projection point information of the sampling point in the multiple target images and the image features, and the operation is simple.
In the three-dimensional point detection method provided by the embodiments of the present disclosure, to extract the sampling point feature information matching a sampling point, the image feature corresponding to the image position information can be extracted from the image features of the target image based on the image position information of the two-dimensional projection point of the sampling point in that target image, and the extracted image feature is used as the sampling point feature information matching the sampling point.
The image features corresponding to a target image may be obtained by image processing, extracted by a trained feature extraction network, or determined by other methods capable of extracting information characterizing the target object, the scene in which the target object is located, and so on, which is not limited in the embodiments of the present disclosure.
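One way to read the image feature at the projected position (an assumption, since the disclosure leaves the extractor open) is bilinear sampling from the per-view feature map; the sketch below assumes PyTorch feature maps of shape (C, H, W) and projections already expressed in the feature map's pixel coordinates.
```python
import torch
import torch.nn.functional as F

def sample_feature(feature_map, uv):
    """Bilinearly sample a (C, H, W) feature map at pixel location uv = (u, v)."""
    _, h, w = feature_map.shape
    # grid_sample expects coordinates normalised to [-1, 1].
    grid = torch.tensor([[[[2 * uv[0] / (w - 1) - 1, 2 * uv[1] / (h - 1) - 1]]]],
                        dtype=feature_map.dtype)
    sampled = F.grid_sample(feature_map.unsqueeze(0), grid, align_corners=True)
    return sampled.view(-1)   # feature vector of length C
```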
To determine a more accurate target three-dimensional point of the target object, the sampling point feature information of a sampling point can first be updated, and the three-dimensional point detection result corresponding to the sampling point then determined based on the updated sampling point feature information. The three-dimensional point detection result corresponding to a sampling point can be determined through the following steps:
Step 1: determining the updated sampling point feature information of the sampling point under the different viewing angles based on the sampling point feature information of the sampling point under the different viewing angles and the sampling point feature information of other sampling points associated with the sampling point;
Step 2: determining the three-dimensional point detection result corresponding to the sampling point based on the updated sampling point feature information corresponding to the sampling point.
Here, the sampling point feature information of a sampling point can be updated using its sampling point feature information under the different viewing angles and the sampling point feature information of the other sampling points associated with it. The updated sampling point feature information incorporates, to a certain extent, the features of other sampling points within one view as well as the features of sampling points across different views, so that the sampling point features become more accurate and, in turn, the determined three-dimensional pose information is also more accurate.
The other sampling points associated with a sampling point may be sampling points that have a connection relationship with it; the connection relationship here corresponds to the connection relationship between sampling points within the same view, while for the sampling point feature information of a sampling point under different viewing angles, what can be determined is the connection relationship between the two-dimensional projection points of the same sampling point in different views. For a sampling point under a target viewing angle, the sampling point feature information can be updated through the following steps:
步骤一、基于采样点在不同视角下的采样点特征信息以及采样点在不同视角下的各个二维投影点之间的第一连接关系对采样点在不同视角下的采样点特征信息进行第一更新,得到第一更新后的采样点特征信息;以及,基于采样点在目标视角下的采样点特征信息以及与采样点同属于目标视角、且与采样点存在第二连接关系的其他采样点的采样点特征信息对采样点在目标视角下的采样点特征信息进行第二更新,得到第二更新后的采样点特征信息; Step 1. Based on the sampling point feature information of the sampling point under different viewing angles and the first connection relationship between the two-dimensional projection points of the sampling point under different viewing angles, the sampling point feature information of the sampling point under different viewing angles is first performed. update, to obtain the first updated sampling point feature information; and, based on the sampling point feature information of the sampling point under the target perspective and the sampling points belonging to the target perspective and other sampling points that have a second connection relationship with the sampling point The sampling point characteristic information performs a second update on the sampling point characteristic information of the sampling point under the target perspective to obtain the second updated sampling point characteristic information;
步骤二、基于第一更新后的采样点特征信息以及第二更新后的采样点特征信息,确定采样点在目标视角下的更新采样点特征信息。Step 2: Based on the first updated sampling point feature information and the second updated sampling point feature information, determine the updated sampling point feature information of the sampling point under the target perspective.
Here, the first connection relationship between the two-dimensional projection points of a sampling point under different viewing angles is predetermined. Based on the first connection relationship, the sampling point feature information under one viewing angle can be updated; that is, the first updated sampling point feature information fuses the features of the same sampling point in other views. In addition, the sampling point feature information of a sampling point can be updated based on the sampling point feature information of other sampling points that belong to the same target viewing angle and have a second connection relationship with the sampling point; this second connection relationship may also be predetermined. The second updated sampling point feature information determined in this way fuses the features of other sampling points in the same view.
Combining the first updated sampling point feature information with the second updated sampling point feature information makes the updated sampling point feature information of the sampling point under the target viewing angle more accurate. The update of the sampling point under other viewing angles can be performed with reference to the above description.
In practical applications, a Graph Neural Network (GNN) can be used to implement the above update of sampling point feature information. Here, before the feature update, a graph model can be constructed based on the first connection relationship, the second connection relationship and the sampling point feature information, and the sampling point feature information of each sampling point can be continuously updated by performing convolution operations on the graph model.
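By way of illustration only, the following is a minimal sketch of one graph-convolution step over such a graph model, written in PyTorch. The module name, the use of row-normalized adjacency matrices for the first and second connection relationships, and the layer sizes are assumptions made for this sketch, not the network actually used in the embodiments.

```python
import torch
import torch.nn as nn

class SamplePointGNN(nn.Module):
    """One round of message passing over cross-view and intra-view edges.

    Hypothetical sketch: node features are sampling-point features (one node per
    sampling point per view); `cross_adj` encodes the first connection relationship
    (same sampling point across views) and `intra_adj` the second connection
    relationship (associated sampling points within a view).
    """
    def __init__(self, dim):
        super().__init__()
        self.cross_fc = nn.Linear(dim, dim)   # first update (across views)
        self.intra_fc = nn.Linear(dim, dim)   # second update (within a view)
        self.fuse = nn.Linear(2 * dim, dim)   # combine both updates

    def forward(self, feats, cross_adj, intra_adj):
        # feats:     [N, dim] sampling-point features for all views
        # cross_adj: [N, N] row-normalized adjacency over cross-view edges
        # intra_adj: [N, N] row-normalized adjacency over intra-view edges
        first = torch.relu(self.cross_fc(cross_adj @ feats))
        second = torch.relu(self.intra_fc(intra_adj @ feats))
        return self.fuse(torch.cat([first, second], dim=-1))
```

In use, several such layers could be stacked and applied repeatedly to keep refining the sampling point features before they are passed to the three-dimensional point detection network.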
For the same target object, the updated sampling point feature information corresponding to each of the sampling points of that target object can be input into a three-dimensional point detection network, and the three-dimensional point detection result corresponding to the target object can be determined from the predicted probability, for each sampling point, that the sampling point is the target point. Here, the three-dimensional coordinate information of the sampling point with the highest predicted probability may be determined as the three-dimensional coordinate information of the target three-dimensional point corresponding to the target object.
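A minimal sketch of this selection step is shown below; the helper name and the use of a softmax over the per-sampling-point logits are assumptions, and any detection head that outputs one score per sampling point could be substituted.

```python
import torch

def select_target_3d_point(sample_coords, sample_feats, detection_head):
    """Pick the sampling point with the highest predicted probability.

    Hypothetical helper: `sample_coords` is [K, 3] (3D coordinates of the K
    sampling points of one target object), `sample_feats` is [K, C] updated
    sampling-point features, and `detection_head` is any network mapping
    features to one logit per sampling point.
    """
    logits = detection_head(sample_feats).squeeze(-1)   # [K]
    probs = torch.softmax(logits, dim=0)                # probability per sampling point
    best = torch.argmax(probs)                          # index of the most likely point
    return sample_coords[best], probs[best]
```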
To facilitate further understanding of the three-dimensional point detection method provided by the embodiments of the present disclosure, a further description is given below with reference to FIG. 2.
As shown by the dotted lines in the upper part of FIG. 2, a graph model G={V, E} is constructed using the two 2D center points in each of the three target images. The nodes V correspond to the image feature information of the 2D center points under each viewing angle, and the edges E correspond to the relationships between nodes, which may be the epipolar distances between the 2D center points.
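The epipolar distance used as an edge weight could, for example, be computed from the fundamental matrix between two views. The sketch below is an illustrative assumption (the symmetric point-to-epipolar-line distance); the embodiments do not prescribe this exact formula.

```python
import numpy as np

def epipolar_distance(pt_a, pt_b, F):
    """Symmetric distance between a 2D point pair under the fundamental matrix F.

    Hypothetical sketch: pt_a and pt_b are pixel coordinates (x, y) of 2D center
    points in two views, and F maps points in view A to epipolar lines in view B.
    """
    a = np.array([pt_a[0], pt_a[1], 1.0])
    b = np.array([pt_b[0], pt_b[1], 1.0])
    line_in_b = F @ a          # epipolar line of pt_a in view B: (l1, l2, l3)
    line_in_a = F.T @ b        # epipolar line of pt_b in view A
    d_b = abs(b @ line_in_b) / np.hypot(line_in_b[0], line_in_b[1])
    d_a = abs(a @ line_in_a) / np.hypot(line_in_a[0], line_in_a[1])
    return 0.5 * (d_a + d_b)
```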
After the graph model is constructed, the image feature information of the 2D center points under different viewing angles can be updated; here, the graph neural network 201 can be used to implement the feature update. For the updated image feature information, the feature matching network 202 can be used to determine whether each pair of 2D center points (that is, each edge) belongs to the same target object. After feature updating and feature matching, the pairing relationships shown by the solid lines in the lower part of FIG. 2 can be obtained.
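A minimal sketch of what such a feature matching network might look like is given below; the MLP structure, hidden size, and 0.5 decision threshold are assumptions for illustration rather than the actual architecture of network 202.

```python
import torch
import torch.nn as nn

class PairMatchingHead(nn.Module):
    """Score whether two 2D center points belong to the same target object.

    Hypothetical sketch of a feature matching network: it concatenates the
    updated image features of the two points and outputs a match probability.
    """
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feat_a, feat_b):
        score = self.mlp(torch.cat([feat_a, feat_b], dim=-1))
        return torch.sigmoid(score)  # > 0.5 -> treat the pair as the same object
```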
Using the pairing relationships shown in the lower part of FIG. 2, a candidate three-dimensional space can be determined for each target object, such as the spherical three-dimensional space pointed to by the dotted lines in the lower part of FIG. 2.
In the embodiments of the present disclosure, by spatially sampling this three-dimensional space and relying on the conversion relationship between the three-dimensional coordinate system and the two-dimensional coordinate systems, the three-dimensional center point detection results corresponding to the sampling points can be determined, and the three-dimensional coordinate information of the target three-dimensional center point of each target object can then be determined.
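The sketch below illustrates, under stated assumptions, how the spherical candidate space could be sampled and how each sampling point could be projected into every view using 3x4 camera projection matrices; the uniform-in-sphere sampling scheme and the sample count are illustrative choices only.

```python
import numpy as np

def sample_and_project(center, radius, proj_mats, num_samples=512, rng=None):
    """Sample a candidate spherical 3D space and project the samples into each view.

    Hypothetical sketch: `center` is the candidate 3D point (3,), `radius` the extent
    of the candidate space, and `proj_mats` a list of 3x4 camera projection matrices
    (one per viewing angle).
    """
    rng = rng or np.random.default_rng()
    # rejection-sample points uniformly inside the sphere
    pts = []
    while len(pts) < num_samples:
        p = rng.uniform(-radius, radius, size=3)
        if np.linalg.norm(p) <= radius:
            pts.append(center + p)
    pts = np.stack(pts)                                            # [K, 3]

    pts_h = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)  # homogeneous [K, 4]
    projections = []
    for P in proj_mats:
        uvw = pts_h @ P.T                                          # [K, 3]
        projections.append(uvw[:, :2] / uvw[:, 2:3])               # pixel coordinates per view
    return pts, projections
```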
Since the three-dimensional point detection method provided by the embodiments of the present disclosure can further search for the target three-dimensional point of each target object within the candidate three-dimensional space, the reconstruction error caused by an insufficiently accurate reconstructed candidate three-dimensional point can be reduced to a certain extent. In addition, even if a pairing error occurs before reconstruction, the target three-dimensional point of a target object can still be determined through the search operation in the candidate three-dimensional space in which each target object is located. For example, if target object A and target object B are paired incorrectly while target object B and target object C are paired correctly, the incorrectly paired search result can be verified against the correctly paired search result, further improving the detection accuracy for multiple target objects.
Based on the three-dimensional coordinate information of the target three-dimensional point of each target object determined above, the approximate position of each target object can be determined, and subsequent multi-person pose recognition can then be performed. In the embodiments of the present disclosure, the movement trajectory of each target object can also be determined by analyzing the three-dimensional coordinate information of the target three-dimensional points of multiple target objects over multiple consecutive frames of target images; other related applications can be implemented as well.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure further provide a three-dimensional point detection apparatus corresponding to the three-dimensional point detection method. Since the principle by which the apparatus in the embodiments of the present disclosure solves the problem is similar to that of the above three-dimensional point detection method of the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method.
Referring to FIG. 3, which is a schematic diagram of a three-dimensional point detection apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition part 301 and a detection part 302; wherein
the acquisition part 301 is configured to acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
the detection part 302 is configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, and determine the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
The embodiments of the present disclosure use the projection relationship between the candidate three-dimensional space in which the candidate three-dimensional point of a target object is located and the target images under multiple viewing angles, so that the three-dimensional point of each target object can be detected accurately. At the same time, the projection operation performed on points within the candidate three-dimensional space avoids voxelizing the entire space, which significantly improves detection efficiency.
In a possible implementation, the three-dimensional point includes a three-dimensional center point; the candidate three-dimensional point includes a candidate three-dimensional center point, and the candidate three-dimensional center point of a target object is located at the center of the target object; and the target three-dimensional point includes a target three-dimensional center point.
In a possible implementation, the detection part 302 is configured to determine the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images according to the following steps:
spatially sampling the candidate three-dimensional space of the target object to determine multiple sampling points;
for each of the multiple sampling points, determining a three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images;
determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the obtained three-dimensional point detection results.
In a possible implementation, the detection part 302 is configured to determine the three-dimensional point detection result corresponding to a sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images according to the following steps:
for each of the multiple sampling points, projecting the three-dimensional coordinate information to different viewing angles based on the correspondence between the three-dimensional coordinate system of the candidate three-dimensional space and the two-dimensional coordinate systems of the respective viewing angles, and determining two-dimensional projection point information of the sampling point in each of the multiple target images;
determining sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the multiple target images;
determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles.
In a possible implementation, the two-dimensional projection point information includes image position information of the two-dimensional projection points; the detection part 302 is configured to determine the sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the multiple target images according to the following steps:
extracting image features respectively corresponding to the multiple target images;
for each of the multiple target images, extracting, from the image features corresponding to that target image, the image features corresponding to the image position information, based on the image position information of the two-dimensional projection points of the sampling point in the multiple target images;
determining, from the extracted image features corresponding to the image position information, the sampling point feature information of the sampling point under different viewing angles (a sketch of this per-view feature sampling follows below).
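As an illustration of the step above, the sketch below reads a feature vector for the sampling point from each view's feature map at the projected image position. Bilinear interpolation via grid_sample is an assumption; the embodiments only require that image features corresponding to the image position information be extracted.

```python
import torch
import torch.nn.functional as F

def sample_point_features(feature_maps, proj_points, image_sizes):
    """Read image features at the 2D projection locations of a sampling point.

    Hypothetical sketch: `feature_maps` is a list of per-view feature maps with
    shape [C, H, W], `proj_points` a list of (x, y) pixel coordinates of the
    sampling point in each view, and `image_sizes` the (width, height) of each
    target image.
    """
    per_view_feats = []
    for fmap, (x, y), (w, h) in zip(feature_maps, proj_points, image_sizes):
        # normalize pixel coordinates to [-1, 1] as expected by grid_sample
        grid = torch.tensor([[[[2.0 * x / (w - 1) - 1.0,
                                2.0 * y / (h - 1) - 1.0]]]], dtype=fmap.dtype)
        feat = F.grid_sample(fmap[None], grid, mode="bilinear",
                             align_corners=True)          # [1, C, 1, 1]
        per_view_feats.append(feat.reshape(-1))           # [C]
    return torch.stack(per_view_feats)                    # [num_views, C]
```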
In a possible implementation, the detection part 302 is configured to determine the three-dimensional point detection result corresponding to a sampling point based on the sampling point feature information of the sampling point under different viewing angles according to the following steps:
determining updated sampling point feature information of the sampling point under different viewing angles based on the sampling point feature information of the sampling point under different viewing angles and the sampling point feature information of other sampling points associated with the sampling point;
determining the three-dimensional point detection result corresponding to the sampling point based on the updated sampling point feature information corresponding to the sampling point.
In a possible implementation, the acquisition part 301 is configured to determine the three-dimensional coordinate information of the candidate three-dimensional point of each target object according to the following steps:
extracting image feature information of multiple two-dimensional points from the multiple target images respectively, where each two-dimensional point is a pixel point located in the corresponding target object;
determining paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the multiple target images, where the paired two-dimensional points come from different target images;
determining the three-dimensional coordinate information of the candidate three-dimensional point of the same target object according to the two-dimensional coordinate information of the determined paired two-dimensional points in their respective target images (a triangulation sketch for this step is given below).
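One common way to recover a candidate three-dimensional point from paired two-dimensional points and known camera projection matrices is linear (DLT) triangulation, sketched below as an assumption; the embodiments are not limited to this particular solver.

```python
import numpy as np

def triangulate_candidate_point(points_2d, proj_mats):
    """Linear (DLT) triangulation of a candidate 3D point from paired 2D points.

    Hypothetical sketch: `points_2d` is a list of (x, y) pixel coordinates of the
    paired two-dimensional points of one target object, and `proj_mats` the
    corresponding 3x4 camera projection matrices.
    """
    rows = []
    for (x, y), P in zip(points_2d, proj_mats):
        rows.append(x * P[2] - P[0])   # each view contributes two linear constraints
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)                 # [2 * num_views, 4]
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]                # homogeneous -> Euclidean 3D coordinates
```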
In a possible implementation, the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the multiple target images according to the following steps:
combining the target images in pairs to obtain at least one group of target images;
determining, based on the image feature information of the two-dimensional points in the multiple target images, whether two two-dimensional points with matching image features exist in each group of target images, the two two-dimensional points respectively belonging to different target images in the same group of target images;
in a case where it is determined that two two-dimensional points with matching image features exist in a group of target images, determining the two two-dimensional points with matching image features as paired two-dimensional points belonging to the same target object.
In a possible implementation, the acquisition part 301 is configured to determine, based on the image feature information of the two-dimensional points in the multiple target images, whether two two-dimensional points with matching image features exist in each group of target images according to the following steps:
for each group of target images, combining the two-dimensional points in the two target images of that group in pairs to obtain multiple groups of two-dimensional points; and determining, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, whether two two-dimensional points with matching image features exist in that group of target images.
In a possible implementation, the acquisition part 301 is configured to determine, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points, whether two two-dimensional points with matching image features exist in that group of target images according to the following steps:
for each group of two-dimensional points, inputting the image feature information of the two two-dimensional points of that group into a feature matching network, and determining whether the image feature information of the two two-dimensional points matches;
in a case where it is determined that the image feature information of the two two-dimensional points matches, determining the two two-dimensional points of any group whose image features match as the two two-dimensional points with matching image features in that group of target images.
In a possible implementation, the acquisition part 301 is configured to input the image feature information of the two two-dimensional points of a group into the feature matching network and determine whether the image feature information of the two two-dimensional points matches according to the following steps:
for each two-dimensional point in the group of two-dimensional points, updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point and the image feature information of other two-dimensional points in other target images different from the target image in which the two-dimensional point is located, to obtain updated image feature information;
inputting the updated image feature information respectively corresponding to the two two-dimensional points into the feature matching network, and determining whether the image feature information of the two two-dimensional points matches.
In a possible implementation, the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the multiple target images according to the following steps:
for a first target image among the multiple target images, updating the image feature information of the multiple two-dimensional points in the first target image based on the image feature information extracted from the first target image and the image feature information extracted from the target images other than the first target image among the multiple target images, to obtain updated image feature information respectively corresponding to the multiple two-dimensional points in the first target image;
determining the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the multiple target images.
In a possible implementation, the acquisition part 301 is configured to determine the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the multiple target images according to the following steps:
arbitrarily selecting two target images from the multiple target images, and selecting two corresponding two-dimensional points respectively from the two selected target images;
inputting the updated image feature information respectively corresponding to the two selected two-dimensional points into a pre-trained feature matching network, and in a case where the network output indicates that the features match successfully, determining the two selected two-dimensional points as paired two-dimensional points belonging to the same target object (an end-to-end sketch of this pairing loop follows below).
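The sketch below ties these pieces together: it scores every cross-view pair of two-dimensional points with a matching network (such as the matching head sketched earlier) and keeps the pairs whose features match. The data layout and the 0.5 threshold are assumptions for illustration.

```python
from itertools import combinations

def pair_points_across_views(updated_feats, matching_net, threshold=0.5):
    """Group 2D points across views into pairs belonging to the same target object.

    Hypothetical sketch: `updated_feats` is a dict {view_id: [num_points, C] tensor}
    of updated image feature information, and `matching_net` takes two feature
    vectors and returns a match probability.
    """
    pairs = []
    for (va, fa), (vb, fb) in combinations(updated_feats.items(), 2):
        for i in range(fa.shape[0]):
            for j in range(fb.shape[0]):
                if matching_net(fa[i], fb[j]).item() > threshold:
                    pairs.append(((va, i), (vb, j)))   # (view, point index) pair
    return pairs
```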
In a possible implementation, the acquisition part 301 is configured to update the image feature information of a two-dimensional point according to the following steps:
determining the epipolar distance between the two-dimensional point and other two-dimensional points based on the two-dimensional coordinate information of the two-dimensional point in its corresponding target image and the two-dimensional coordinate information of the other two-dimensional points in other target images different from the target image in which the two-dimensional point is located;
updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images different from the target image in which the two-dimensional point is located, and the epipolar distance, to obtain updated image feature information (one way such an epipolar-weighted update could be realized is sketched below).
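Purely as an illustration of how the epipolar distance could modulate the update, the sketch below converts distances into attention weights with a Gaussian kernel and aggregates the other views' point features; the kernel, bandwidth, and residual fusion are assumptions, not the update rule of the embodiments.

```python
import torch

def update_point_feature(feat, other_feats, epipolar_dists, sigma=16.0):
    """Update a 2D point's feature from other views' features and epipolar distances.

    Hypothetical sketch: `feat` is the [C] feature of the point being updated,
    `other_feats` is [M, C] for the 2D points in the other target images, and
    `epipolar_dists` is [M] (in pixels).
    """
    weights = torch.softmax(-epipolar_dists ** 2 / (2.0 * sigma ** 2), dim=0)  # [M]
    aggregated = (weights[:, None] * other_feats).sum(dim=0)                   # [C]
    return feat + aggregated   # residual-style fusion of cross-view evidence
```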
For the description of the processing flows of the parts in the apparatus and the interaction flows between the parts, reference may be made to the relevant descriptions in the above method embodiments, and details are not repeated here.
An embodiment of the present disclosure further provides an electronic device. As shown in FIG. 4, which is a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure, the electronic device includes: a processor 401, a memory 402 and a bus 403. The memory 402 stores machine-readable instructions executable by the processor 401 (for example, execution instructions corresponding to the acquisition part 301 and the detection part 302 in the apparatus of FIG. 3). When the electronic device is running, the processor 401 and the memory 402 communicate through the bus 403, and when the machine-readable instructions are executed by the processor 401, the following processing is performed:
acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
for each target object, performing the following steps:
determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the three-dimensional point detection method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product. The computer program product includes a computer program or instructions that, when run on a computer, cause the computer to execute the steps of the three-dimensional point detection method described in the above method embodiments; reference may be made to the above method embodiments.
The above computer program product may be implemented by hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the working processes of the systems and apparatuses described above, reference may be made to the corresponding processes in the foregoing method embodiments. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. The apparatus embodiments described above are schematic; for example, the division of the units is a logical functional division, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or may be integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, apparatuses or units, and may be in electrical, mechanical or other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure, in essence, or the part contributing to the related art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
The aforementioned computer-readable storage medium may be a tangible device capable of retaining and storing instructions used by an instruction execution device, and may be a volatile storage medium or a non-volatile storage medium. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM or flash memory), a Static Random-Access Memory (SRAM), a portable Compact Disk Read Only Memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the above. As used herein, a computer-readable storage medium is not to be construed as a transient signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
Finally, it should be noted that the above embodiments are only specific implementations of the embodiments of the present disclosure, used to illustrate the technical solutions of the embodiments of the present disclosure rather than to limit them, and the protection scope of the embodiments of the present disclosure is not limited thereto. Although the embodiments of the present disclosure have been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the embodiments of the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered within the protection scope of the embodiments of the present disclosure. Therefore, the protection scope of the embodiments of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
The embodiments of the present disclosure acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images; and, for each target object, perform the following steps: determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images. In this way, using the projection relationship between the candidate three-dimensional space in which the candidate three-dimensional points of a target object are located and the target images under multiple viewing angles, the three-dimensional point of each target object can be detected accurately; at the same time, the projection operation on points within the candidate three-dimensional space avoids voxelizing the entire space, which significantly improves detection efficiency.

Claims (18)

1. A method for three-dimensional point detection, the method comprising:
acquiring target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images;
for each target object, determining a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object; and
determining three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
2. The method according to claim 1, wherein the three-dimensional point comprises a three-dimensional center point; the candidate three-dimensional point comprises a candidate three-dimensional center point, and the candidate three-dimensional center point of the target object is located at the center of the target object; and the target three-dimensional point comprises a target three-dimensional center point.
3. The method according to claim 1 or 2, wherein the determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images comprises:
spatially sampling the candidate three-dimensional space of the target object to determine a plurality of sampling points;
for each of the plurality of sampling points, determining a three-dimensional point detection result corresponding to the sampling point based on three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images; and
determining the three-dimensional coordinate information of the target three-dimensional point of the target object based on the obtained three-dimensional point detection results.
4. The method according to claim 3, wherein the determining the three-dimensional point detection result corresponding to the sampling point based on the three-dimensional coordinate information of the sampling point in the candidate three-dimensional space and the target images comprises:
for each of the plurality of sampling points, projecting the three-dimensional coordinate information to different viewing angles based on a correspondence between a three-dimensional coordinate system of the candidate three-dimensional space and two-dimensional coordinate systems of the respective viewing angles, and determining two-dimensional projection point information of the sampling point in each of the plurality of target images;
determining sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the plurality of target images; and
determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles.
5. The method according to claim 4, wherein the two-dimensional projection point information comprises image position information of two-dimensional projection points, and the determining the sampling point feature information of the sampling point under different viewing angles based on the two-dimensional projection point information of the sampling point in the plurality of target images comprises:
extracting image features respectively corresponding to the plurality of target images;
for each of the plurality of target images, extracting, from the image features corresponding to the target image, image features corresponding to the image position information, based on the image position information of the two-dimensional projection points of the sampling point in the plurality of target images; and
determining, from the extracted image features corresponding to the image position information, the sampling point feature information of the sampling point under different viewing angles.
6. The method according to claim 4 or 5, wherein the determining the three-dimensional point detection result corresponding to the sampling point based on the sampling point feature information of the sampling point under different viewing angles comprises:
determining updated sampling point feature information of the sampling point under different viewing angles based on the sampling point feature information of the sampling point under different viewing angles and sampling point feature information of other sampling points, the other sampling points being sampling points associated with the sampling point; and
determining the three-dimensional point detection result corresponding to the sampling point based on the updated sampling point feature information corresponding to the sampling point.
7. The method according to any one of claims 1 to 6, wherein determining the three-dimensional coordinate information of the candidate three-dimensional point of each target object comprises:
extracting image feature information of a plurality of two-dimensional points from the plurality of target images respectively, wherein each of the plurality of two-dimensional points is a pixel point located in a corresponding target object;
determining paired two-dimensional points belonging to a same target object based on the image feature information respectively extracted from the plurality of target images, wherein the paired two-dimensional points come from different target images; and
determining three-dimensional coordinate information of a candidate three-dimensional point of the same target object according to two-dimensional coordinate information of the determined paired two-dimensional points in their respective target images.
8. The method according to claim 7, wherein the determining the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the plurality of target images comprises:
combining the target images in pairs to obtain at least one group of target images;
determining, based on the image feature information of the two-dimensional points in the plurality of target images, whether two two-dimensional points with matching image features exist in each group of target images in the at least one group of target images, the two two-dimensional points respectively belonging to different target images in a same group of target images; and
in a case where it is determined that two two-dimensional points with matching image features exist in each group of target images, determining the two two-dimensional points with matching image features as the paired two-dimensional points belonging to the same target object.
9. The method according to claim 8, wherein the determining, based on the image feature information of the two-dimensional points in the plurality of target images, whether two two-dimensional points with matching image features exist in each group of target images in the at least one group of target images comprises:
for a target image group in the at least one group of target images, combining the two-dimensional points in the two target images of the target image group in pairs to obtain a plurality of groups of two-dimensional points; and determining, based on image feature information of two two-dimensional points included in a two-dimensional point group of the plurality of groups of two-dimensional points, whether two two-dimensional points with matching image features exist in the target image group.
10. The method according to claim 9, wherein the determining, based on the image feature information of the two two-dimensional points included in each group of two-dimensional points of the plurality of groups of two-dimensional points, whether two two-dimensional points with matching image features exist in the target image group comprises:
for a two-dimensional point group of the plurality of groups of two-dimensional points, inputting the image feature information of the two two-dimensional points of the two-dimensional point group into a feature matching network, and determining whether the image feature information of the two two-dimensional points matches; and
in a case where it is determined that the image feature information of the two two-dimensional points matches, determining the two two-dimensional points with matching image features as the two two-dimensional points with matching image features existing in the target image group.
11. The method according to claim 10, wherein the inputting the image feature information of the two two-dimensional points of the two-dimensional point group into the feature matching network and determining whether the image feature information of the two two-dimensional points matches comprises:
for each two-dimensional point in the two-dimensional point group, updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point and image feature information of other two-dimensional points in other target images, to obtain updated image feature information, the other target images being target images different from the target image in which the two-dimensional point is located; and
inputting the updated image feature information respectively corresponding to the two two-dimensional points into the feature matching network, and determining whether the image feature information of the two two-dimensional points matches.
12. The method according to claim 7, wherein the determining the paired two-dimensional points belonging to the same target object based on the image feature information respectively extracted from the plurality of target images comprises:
for a first target image among the plurality of target images, updating image feature information of a plurality of two-dimensional points in the first target image based on the image feature information extracted from the first target image and image feature information extracted from other target images, to obtain updated image feature information respectively corresponding to the plurality of two-dimensional points in the first target image, the other target images being target images among the plurality of target images other than the first target image; and
determining the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the plurality of target images.
13. The method according to claim 12, wherein the determining the paired two-dimensional points belonging to the same target object based on the updated image feature information respectively corresponding to the plurality of target images comprises:
arbitrarily selecting two target images from the plurality of target images, and selecting two corresponding two-dimensional points respectively from the two arbitrarily selected target images; and
inputting the updated image feature information respectively corresponding to the two selected two-dimensional points into a pre-trained feature matching network, and in a case where it is determined that a network output indicates that the features match successfully, determining the two selected two-dimensional points as the paired two-dimensional points belonging to the same target object.
14. The method according to any one of claims 11 to 13, wherein updating the image feature information of the two-dimensional point comprises:
determining an epipolar distance between the two-dimensional point and other two-dimensional points based on two-dimensional coordinate information of the two-dimensional point in the corresponding target image and two-dimensional coordinate information of the other two-dimensional points in other target images, the other target images being target images different from the target image in which the two-dimensional point is located; and
updating the image feature information of the two-dimensional point based on the image feature information of the two-dimensional point, the image feature information of the other two-dimensional points in the other target images different from the target image in which the two-dimensional point is located, and the epipolar distance, to obtain updated image feature information.
15. An apparatus for three-dimensional point detection, the apparatus comprising:
an acquisition part configured to acquire target images obtained by photographing multiple target objects from multiple viewing angles, and three-dimensional coordinate information of candidate three-dimensional points of each of the multiple target objects determined based on the acquired target images; and
a detection part configured to, for each target object, determine a candidate three-dimensional space corresponding to the target object based on the three-dimensional coordinate information of the candidate three-dimensional point of the target object, and determine three-dimensional coordinate information of a target three-dimensional point of the target object based on the candidate three-dimensional space corresponding to the target object and the target images.
16. An electronic device, comprising: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, and when the electronic device is running, the processor and the memory communicate through the bus; and when the machine-readable instructions are executed by the processor, the steps of the method for three-dimensional point detection according to any one of claims 1 to 14 are executed.
17. A computer-readable storage medium, having a computer program stored thereon, wherein when the computer program is run by a processor, the steps of the method for three-dimensional point detection according to any one of claims 1 to 14 are executed.
18. A computer program product, comprising a computer program or instructions which, when run on a computer, cause the computer to execute the steps of the method for three-dimensional point detection according to any one of claims 1 to 14.
PCT/CN2022/088149 2021-08-13 2022-04-21 Three-dimensional point detection method and apparatus, electronic device, and storage medium WO2023015938A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110929512.6A CN113610967B (en) 2021-08-13 2021-08-13 Three-dimensional point detection method, three-dimensional point detection device, electronic equipment and storage medium
CN202110929512.6 2021-08-13

Publications (1)

Publication Number Publication Date
WO2023015938A1 (en)

Family

ID=78340615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/088149 WO2023015938A1 (en) 2021-08-13 2022-04-21 Three-dimensional point detection method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN113610967B (en)
WO (1) WO2023015938A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610967B (en) * 2021-08-13 2024-03-26 北京市商汤科技开发有限公司 Three-dimensional point detection method, three-dimensional point detection device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180218513A1 (en) * 2017-02-02 2018-08-02 Intel Corporation Method and system of automatic object dimension measurement by using image processing
CN109766882A (en) * 2018-12-18 2019-05-17 北京诺亦腾科技有限公司 Label identification method, the device of human body luminous point
CN112200851A (en) * 2020-12-09 2021-01-08 北京云测信息技术有限公司 Point cloud-based target detection method and device and electronic equipment thereof
CN112950668A (en) * 2021-02-26 2021-06-11 北斗景踪技术(山东)有限公司 Intelligent monitoring method and system based on mold position measurement
CN113610967A (en) * 2021-08-13 2021-11-05 北京市商汤科技开发有限公司 Three-dimensional point detection method and device, electronic equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815754B (en) * 2019-04-12 2023-05-30 Oppo广东移动通信有限公司 Three-dimensional information determining method, three-dimensional information determining device and terminal equipment
CN111951326A (en) * 2019-05-15 2020-11-17 北京地平线机器人技术研发有限公司 Target object skeleton key point positioning method and device based on multiple camera devices
WO2021046716A1 (en) * 2019-09-10 2021-03-18 深圳市大疆创新科技有限公司 Method, system and device for detecting target object and storage medium
CN112991440B (en) * 2019-12-12 2024-04-12 纳恩博(北京)科技有限公司 Positioning method and device for vehicle, storage medium and electronic device
WO2021184289A1 (en) * 2020-03-19 2021-09-23 深圳市大疆创新科技有限公司 Methods and device for solving an object and flying around point
CN111582207B (en) * 2020-05-13 2023-08-15 北京市商汤科技开发有限公司 Image processing method, device, electronic equipment and storage medium
CN112528831B (en) * 2020-12-07 2023-11-24 深圳市优必选科技股份有限公司 Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
CN112926395A (en) * 2021-01-27 2021-06-08 上海商汤临港智能科技有限公司 Target detection method and device, computer equipment and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TU HANYUE; WANG CHUNYU; ZENG WENJUN: "VoxelPose: Towards Multi-camera 3D Human Pose Estimation in Wild Environment", Computer Vision – ECCV 2020 (16th European Conference), pages 197-212, XP047593627, DOI: 10.1007/978-3-030-58452-8_12 *
WU SIZE; JIN SHENG; LIU WENTAO; BAI LEI; QIAN CHEN; LIU DONG; OUYANG WANLI: "Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images", 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 10 October 2021, pages 11128-11137, XP034092993, DOI: 10.1109/ICCV48922.2021.01096 *

Also Published As

Publication number Publication date
CN113610967B (en) 2024-03-26
CN113610967A (en) 2021-11-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22854942

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE