WO2023016082A1 - Three-dimensional reconstruction method and apparatus, electronic device and storage medium - Google Patents


Info

Publication number
WO2023016082A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
vehicle
model
target
group
Prior art date
Application number
PCT/CN2022/098993
Other languages
English (en)
Chinese (zh)
Inventor
张保成
Original Assignee
北京迈格威科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京迈格威科技有限公司
Publication of WO2023016082A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/06 - Topological mapping of higher dimensional structures onto lower dimensional surfaces
    • G06T7/00 - Image analysis
    • G06T7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to a three-dimensional reconstruction method, device, electronic equipment, and storage medium.
  • the purpose of the embodiments of the present disclosure is to provide a three-dimensional reconstruction method, device, electronic equipment and storage medium to solve the above problems.
  • an embodiment of the present disclosure provides a three-dimensional reconstruction method, the method comprising: acquiring a plurality of monitoring image data, each of which includes image data of a vehicle; determining a target image group of a target model of vehicle according to the plurality of monitoring image data, the target image group including images of the target model of vehicle at different viewing angles; obtaining the calibration results of the cameras corresponding to the images in the target image group; and obtaining the three-dimensional model of the vehicle of the target model according to the target image group and the calibration results of the cameras corresponding to the images in the target image group.
  • In the above solution, the images of the vehicle of the target model at different viewing angles are determined from the monitoring image data, and the three-dimensional model of the vehicle of the target model is then obtained from those images and the calibration results of the corresponding cameras.
  • Compared with manually scanning a vehicle with a hand-held 3D scanning device, building the vehicle's 3D model from monitoring image data is more efficient; moreover, since neither 3D scanning equipment nor a physical vehicle is needed, the cost is lower and the method is easier to implement.
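The claimed pipeline (acquire, group by model, calibrate, reconstruct) can be sketched as follows. This is an illustrative outline only: the helper callables (`detect_vehicles`, `classify_model`, `calibrate_camera`, `bundle_adjust`) are hypothetical stand-ins for the models and steps described in the text, not APIs named by the patent.

```python
def reconstruct_vehicle_model(monitoring_images, target_model,
                              detect_vehicles, classify_model,
                              calibrate_camera, bundle_adjust):
    """Hypothetical sketch of steps S11-S14 of the disclosed method."""
    # S11/S12: detect vehicles and keep images of the target model
    target_group = [img for img in detect_vehicles(monitoring_images)
                    if classify_model(img) == target_model]
    # S13: obtain the calibration result of each image's camera
    calibrations = [calibrate_camera(img) for img in target_group]
    # S14: jointly optimise an initial 3D model against all views
    return bundle_adjust(target_group, calibrations)
```

With stub functions in place of the real models, the control flow can be exercised end to end before any detector or optimiser exists.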
  • the determining of the target image group of the vehicle of the target model according to the plurality of monitoring image data includes: detecting vehicles in the plurality of monitoring image data to obtain a plurality of vehicle images; grouping the plurality of vehicle images according to vehicle model to obtain at least one image group corresponding to at least one model of vehicle, each image group including a plurality of vehicle images of the corresponding model at different viewing angles; and determining one image group from the at least one image group as the target image group.
  • In the above solution, multiple vehicle images are obtained by detecting vehicles in the multiple monitoring image data, and the vehicle images are then grouped by vehicle model, which prevents images of model-A vehicles from being used later in the three-dimensional reconstruction of model-B vehicles and thus improves the accuracy of the reconstruction; moreover, because only the images of the vehicles need to be processed during grouping, images of other objects in the surveillance video need not be processed, which prevents those images from interfering with the grouping, reduces the complexity of the grouping, and improves grouping efficiency and accuracy.
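The grouping step reduces to bucketing cropped vehicle images by the label a recognition model assigns them. A minimal sketch, where `recognize_model` is an assumed stand-in for the pre-trained vehicle model recognition model described in the text:

```python
from collections import defaultdict

def group_by_model(vehicle_images, recognize_model):
    """Group cropped vehicle images by recognised vehicle model (step A2
    in the detailed description). `recognize_model` maps an image to a
    model label; here it is an illustrative assumption, not a real API."""
    groups = defaultdict(list)
    for img in vehicle_images:
        groups[recognize_model(img)].append(img)
    return dict(groups)
```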
  • before determining an image group from the at least one image group as the target image group, the method further includes: for each image group in the at least one image group, deduplicating the vehicle images in the image group that belong to the same viewing angle.
  • the vehicle images belonging to the same viewing angle in the image group are deduplicated, so as to reduce the complexity of 3D reconstruction using the image group and improve the efficiency of 3D reconstruction.
  • obtaining the 3D model of the vehicle of the target model includes: obtaining the key point information group corresponding to each image in the target image group, the key point information group including the positions, in the image, of a plurality of two-dimensional key points that represent the outline of the vehicle in the corresponding image; and obtaining the 3D model of the vehicle of the target model according to the key point information group corresponding to each image in the target image group and the calibration result of the camera that captured each image.
  • The key point information group includes the positions, in the image, of multiple two-dimensional key points representing the outline of the vehicle in the corresponding image; the 3D model of the vehicle of the target model can then be obtained from the key point information group corresponding to each image in the target image group and the calibration result of the camera that captured each image, without using the position information of every point in the images, which improves the efficiency of the 3D reconstruction.
  • obtaining the 3D model of the vehicle of the target model according to the key point information group corresponding to each image in the target image group and the calibration result of the camera that captured each image includes: determining an initial three-dimensional model of the vehicle of the target model, the initial three-dimensional model comprising the three-dimensional key points constituting the model and the initial coordinates of each three-dimensional key point in the model coordinate system; for each image in the target image group, determining the initial pose, in the world coordinate system, of the vehicle in the image at the time the image was captured; and, according to the key point information group corresponding to each image and the calibration result of the camera that captured it, optimizing the initial pose of the vehicle in each image and the initial coordinates of the three-dimensional key points of the initial model using the bundle adjustment method, to obtain the three-dimensional model of the vehicle of the target model.
  • Optimizing the initial pose of the vehicle in each image and the initial coordinates of the 3D key points of the initial 3D model against the calibration results of the cameras reduces the impact of noise on the 3D reconstruction results and improves the accuracy of the reconstruction.
  • For each image, the position of the initial projection points in the image coordinate system is determined from the initial 3D model, the camera calibration result corresponding to the image, and the initial pose; a first loss value for the image is then computed from the position differences between the initial projection points and the corresponding two-dimensional key points. According to the first loss values of the images, the initial coordinates of the three-dimensional key points of the initial three-dimensional model and the initial pose corresponding to each image are optimized until the new loss value, determined using the optimized 3D model and the optimized poses, meets the preset condition, at which point the optimization stops; this guarantees the accuracy of the final 3D model.
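The "first loss value" described above is a reprojection error: project each 3D key point through the camera and compare with the observed 2D key point. A minimal numpy sketch of that per-image loss, assuming an intrinsic matrix `K` and a pose `(R, t)`; a real bundle adjustment would minimise the sum of these losses jointly over all images, the poses, and the 3D coordinates.

```python
import numpy as np

def reprojection_loss(points_3d, points_2d, K, R, t):
    """Sum of squared distances between projected 3D key points and the
    observed 2D key points for one image (the 'first loss value').
    points_3d: (N, 3) key points; points_2d: (N, 2) observations."""
    cam = R @ points_3d.T + t.reshape(3, 1)   # world -> camera frame
    proj = K @ cam                             # camera frame -> image plane
    uv = (proj[:2] / proj[2]).T                # perspective divide
    return float(np.sum((uv - points_2d) ** 2))
```

When the model, poses, and observations are consistent the loss is zero; noise in any of them makes it positive, which is what the optimisation drives down.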
  • obtaining the calibration result of the camera corresponding to each image in the target image group includes: determining the calibration results of the cameras corresponding to the images in the target image group according to the plurality of surveillance image data.
  • The monitoring image data are used to determine the calibration results of the cameras, which ensures that the subsequent three-dimensional reconstruction of the vehicle of the target model can be performed according to those calibration results.
  • an embodiment of the present disclosure provides a three-dimensional reconstruction device, the device comprising: an acquisition unit configured to acquire a plurality of monitoring image data, each of which includes image data of a vehicle; an image group determining unit configured to determine a target image group of a target model of vehicle according to the plurality of monitoring image data, the target image group including images of the target model of vehicle at different viewing angles; a calibration result acquisition unit configured to obtain the calibration result of the camera corresponding to each image in the target image group; and a three-dimensional model obtaining unit configured to obtain the three-dimensional model of the vehicle of the target model according to the target image group and the calibration results of the cameras corresponding to the images in the target image group.
  • the image group determination unit includes: a detection unit configured to detect vehicles in the plurality of monitoring image data to obtain a plurality of vehicle images
  • the grouping unit may be configured to group the plurality of vehicle images according to the model of the vehicle to obtain at least one image group corresponding to at least one model of the vehicle, each image group including the vehicle of the corresponding model in A plurality of vehicle images under different viewing angles;
  • the selecting unit may be configured to determine one image group from the at least one image group as the target image group.
  • the device further includes: a deduplication unit configured to, for each image group in the at least one image group, The image of the vehicle under the perspective is deduplicated.
  • the 3D model obtaining unit includes: an information group obtaining unit configured to obtain the key point information group corresponding to each image in the target image group;
  • the key point information group includes: the position of multiple two-dimensional key points representing the outline of the vehicle in the corresponding image in the image;
  • the three-dimensional model obtaining subunit may be configured to obtain the three-dimensional model of the vehicle of the target model according to the key point information group corresponding to each image and the calibration results of the cameras that captured the images.
  • the 3D model obtaining subunit includes: an initial model determining unit configured to determine the initial 3D model of the vehicle of the target model, the initial 3D model including each 3D key point constituting the 3D model and the initial coordinates of each 3D key point in the model coordinate system;
  • the initial pose determination unit may be configured to, for each image in the target image group, Determining the initial pose of the vehicle in the image in the world coordinate system when the image is captured;
  • the optimization unit may be configured to optimize, according to the key point information group corresponding to each image in the target image group and the calibration results of the cameras that captured the images, the initial pose of the vehicle in each image and the initial coordinates of the 3D key points of the initial 3D model, to obtain the 3D model of the vehicle of the target model.
  • the optimization unit includes: a projection unit configured to, for each image in the target image group, determine the position, in the image coordinate system, of the initial projection points corresponding to the image according to the initial three-dimensional model, the camera calibration result corresponding to the image, and the initial pose, wherein the initial projection points are the points obtained by projecting, under the initial pose corresponding to the image, the three-dimensional key points of the initial three-dimensional model that correspond to the two-dimensional key points of the image into the image coordinate system; a loss determination unit that may be configured to determine the first loss value corresponding to the image according to the position differences between the initial projection points corresponding to the image and the corresponding two-dimensional key points; and an optimization subunit that may be configured to optimize, according to the first loss values, the initial coordinates of the three-dimensional key points and the initial poses corresponding to the images, until a new loss value determined using the optimized three-dimensional model and the optimized poses meets the preset condition; the optimized three-dimensional model is the three-dimensional model of the vehicle of the target model.
  • the calibration result acquisition unit may be configured to determine the calibration of the camera corresponding to each image in the target image group according to the plurality of surveillance image data result.
  • an embodiment of the present disclosure provides an electronic device, including a processor and a memory connected to the processor, where a computer program is stored in the memory, and when the computer program is executed by the processor, the The electronic device executes the method described in the first aspect.
  • an embodiment of the present disclosure provides a storage medium, where a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer executes the method described in the first aspect.
  • FIG. 1 is a schematic flowchart of a three-dimensional reconstruction method provided by an embodiment of the present disclosure.
  • FIG. 2 is a schematic structural diagram of a three-dimensional reconstruction device provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • Reference numerals: 200 - three-dimensional reconstruction device; 210 - acquisition unit; 220 - image group determination unit; 230 - calibration result acquisition unit; 240 - three-dimensional model obtaining unit; 300 - electronic device; 301 - processor; 302 - memory; 303 - communication interface.
  • FIG. 1 is a flow chart of a three-dimensional reconstruction method provided by an embodiment of the present disclosure. The process shown in FIG. 1 will be described in detail below, and the method includes steps: S11-S14.
  • S11 Acquire a plurality of monitoring image data; the plurality of monitoring image data all include image data of a vehicle.
  • S12 Determine a target image group of the target model vehicle according to the plurality of monitoring image data; the target image group includes: images of the target model vehicle under different viewing angles.
  • S13 Obtain the calibration results of the cameras corresponding to the images in the target image group.
  • S14 Obtain the three-dimensional model of the vehicle of the target model according to the target image group and the calibration results of the corresponding cameras.
  • S11 Acquire a plurality of monitoring image data; the plurality of monitoring image data all include image data of a vehicle.
  • S11 may be implemented in the following manner, acquiring a plurality of monitoring image data captured by at least one camera from a third party, wherein the plurality of monitoring image data all include vehicle image data.
  • In this way, the multiple monitoring image data can be acquired without communicating directly with the at least one camera, so the complexity of the communication connection is low; the effect is more pronounced when the number of cameras is large.
  • S11 can also be implemented by acquiring the monitoring image data captured by the at least one camera within a specified time period before the current moment, so as to ensure that the acquired monitoring image data contain images of the latest vehicle models.
  • the latest vehicle model may be a model of a vehicle produced within one year before the current moment, and in other embodiments, the latest vehicle model may also be a model of a vehicle produced within half a year before the current moment.
  • S11 may be implemented in the following manner, acquiring a plurality of monitoring image data sent by at least one camera.
  • the image may be a picture or a video.
  • When there are at least two cameras, the cameras can be installed at different geographical locations or at different angles, so that images from different angles can be captured. The cameras may be cameras installed on roads inside a park, in a company parking lot, and the like, that can capture images of vehicles.
  • S12 Determine a target image group of the target model vehicle according to the plurality of monitoring image data; the target image group includes: images of the target model vehicle under different viewing angles.
  • the target model is the model of the vehicle whose three-dimensional model needs to be established.
  • S12 can be implemented as follows: based on a predetermined target model, a pre-trained vehicle model recognition model is used to identify, from the multiple monitoring image data, multiple images of vehicles belonging to the target model, and these images are placed into the target image group.
  • the target image group needs to contain images of the vehicle under multiple viewing angles.
  • The target image group needs to include images of the various surfaces of the target model of vehicle, such as its front surface; for surfaces that mirror one already covered, the target image group may omit the images under the corresponding viewing angles. For example, if the target image group already includes an image of the left side of the vehicle, an image of the right side may be omitted.
  • A specific way to identify which surface of the vehicle an image shows is: according to the image of the vehicle, determine the information of the multiple two-dimensional key points that represent the outline of the vehicle in the image, and then determine from that key point information which surface the image shows. For example, if the two-dimensional key points representing the outline of the vehicle in an image include key points of the left window, the left front wheel, the left rear wheel, and so on, it can be determined that the image is a left-side view.
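The rule above (classify the view from which named key points were detected) can be sketched as a simple vote over characteristic key-point sets. The key-point names and the voting rule are illustrative assumptions, not the patent's actual scheme:

```python
def classify_view(detected_keypoints):
    """Guess which surface of the vehicle an image shows from the names
    of its detected 2D key points. Returns a view name or None."""
    views = {
        "left": {"left_window", "left_front_wheel", "left_rear_wheel"},
        "right": {"right_window", "right_front_wheel", "right_rear_wheel"},
        "front": {"windshield", "left_headlight", "right_headlight"},
        "rear": {"rear_window", "left_taillight", "right_taillight"},
    }
    names = set(detected_keypoints)
    # pick the surface whose characteristic key points overlap most
    best = max(views, key=lambda v: len(views[v] & names))
    return best if views[best] & names else None
```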
  • The origin of the image coordinate system corresponding to an image can be the center of the image or a vertex of the image; one of the u-axis and the v-axis of the image coordinate system is parallel to the upper edge of the image, and the other is parallel to a side edge of the image.
  • Alternatively, S12 can be implemented as follows: for each vehicle image in the multiple surveillance video data, input the image into a pre-trained vehicle model recognition model to obtain the vehicle model, and divide the images of the same model captured at different viewing angles into one image group; then, from the resulting image groups, determine one image group sequentially or randomly as the target image group. Alternatively, considering that constructing the three-dimensional model of the target model of vehicle requires the target image group to include images of every surface of that vehicle, the target image group can be an image group whose number of images meets a certain requirement (for example, greater than or equal to 200) or whose angle distribution meets a certain requirement (for example, images distributed over certain specific angles). It can be understood that the larger the number of images in the target image group, the more certain it is that images of every surface of the target model of vehicle can be found in it.
  • S12 includes steps: A1-A3.
  • A1 Detecting vehicles in the multiple monitoring image data to obtain multiple vehicle images.
  • A1 can be implemented in the following manner. For each image in multiple monitoring image data, use a pre-trained vehicle detection model to detect the vehicle in the image to obtain the position of the vehicle detection frame. The vehicle image can be cropped according to the position of the vehicle detection frame.
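Cropping the vehicle image from the detection box (step A1) amounts to slicing the frame, clamping the box to the image bounds. The `(x1, y1, x2, y2)` box format is an assumption for illustration; the patent does not specify one:

```python
def crop_vehicle(image, box):
    """Crop a detected vehicle from a frame. `image` is an H x W (x C)
    array-like (rows of pixels); `box` is (x1, y1, x2, y2) in pixels,
    clamped to the image bounds before slicing."""
    h, w = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, y1 = max(0, int(x1)), max(0, int(y1))
    x2, y2 = min(w, int(x2)), min(h, int(y2))
    return [row[x1:x2] for row in image[y1:y2]]
```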
  • step A2 is performed.
  • A2 Group the vehicle images according to vehicle model to obtain at least one image group corresponding to at least one model of vehicle, each image group including images of the corresponding model of vehicle at different viewing angles.
  • A2 can be implemented as follows: for each vehicle image, use a pre-trained vehicle brand recognition model to identify the brand of the vehicle in the image (the image may include the vehicle's logo); after the brand of the vehicle in the image is determined, use a pre-trained vehicle model detection model corresponding to that brand to group the images of vehicles of the same brand, dividing the vehicle images of vehicles of the same model at various viewing angles into one image group, so as to obtain at least one image group in one-to-one correspondence with at least one model of vehicle.
  • The number of images in an image group needs to meet a certain requirement (for example, the number of images in the group is greater than or equal to 200, or 3000), to ensure that the group includes vehicle images of the corresponding model at various viewing angles, and thus that the three-dimensional model of the vehicle can later be determined accurately.
  • Alternatively, A2 can be implemented as follows: for each vehicle image, input the image into a pre-trained vehicle model recognition model to obtain the vehicle model, and then divide the vehicle images of the same model into one image group, obtaining at least one image group corresponding to at least one model of vehicle.
  • step A3 may be performed.
  • A3 Determine an image group from the at least one image group as the target image group.
  • A3 may be implemented in the following manner, one image group is sequentially determined from the at least one image group as the target image group.
  • A3 may be implemented in the following manner, randomly determining an image group from the at least one image group as the target image group.
  • A3 may be implemented in the following manner, using an image group whose number of images satisfies the requirement as a target image group.
  • A3 may be implemented in the following manner. First, the target model is determined, and then the image group corresponding to the target model is used as the target image group.
  • After that, each of the remaining image groups can in turn be taken as the target image group, and S13-S14 executed with it, so as to determine the three-dimensional models of vehicles of multiple models.
  • step S13 is executed.
  • The camera calibration result may include the intrinsic matrix K of the camera, the extrinsic matrix P of the camera, or the matrix S obtained by multiplying the intrinsic matrix K and the extrinsic matrix P.
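How these matrices relate can be shown in a few lines: the combined matrix S = K·P maps a homogeneous world point to a homogeneous image point up to a scale factor. The numeric values below are illustrative, not from the patent:

```python
import numpy as np

# Intrinsic matrix K: focal lengths on the diagonal, principal point in
# the last column.  Extrinsic matrix P = [R | t] with identity rotation
# and a 10-unit translation along the camera axis (illustrative values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P = np.hstack([np.eye(3), [[0.0], [0.0], [10.0]]])
S = K @ P                                      # combined projection matrix

world_point = np.array([1.0, 0.5, 0.0, 1.0])   # homogeneous (X, Y, Z, 1)
u, v, s = S @ world_point
print(u / s, v / s)                            # pixel coordinates after the perspective divide
```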
  • the specific calibration method may use any camera calibration method, which is not limited here.
  • One category of calibration methods is calibration based on a target captured by the camera.
  • a specific method of camera calibration based on the target captured by the camera is as follows:
  • A specific method of camera calibration based on a target captured by the camera is as follows: obtain the 3D model of a known target (the 3D model of the known target can be reconstructed by manual scanning or similar methods, and includes multiple 3D key points); obtain the pose of the known target in the world coordinate system (from which the positions of the multiple 3D key points in the world coordinate system and the attitude of the 3D model can be determined); use the camera to be calibrated to photograph the known target to obtain a calibration image; perform key point detection on the known target in the calibration image to obtain each two-dimensional key point of the known target and the position (u, v) of each two-dimensional key point in the image coordinate system; then use the correspondence between the two-dimensional key points and the three-dimensional key points of the 3D model. For any 2D key point of the target in any calibration image, a 3D key point corresponding to the position of that 2D key point can always be found in the 3D model of the target; conversely, for a 3D key point of the 3D model there may be no corresponding 2D key point in the calibration image. For example, for the 2D key point at the upper left corner of the front door, the corresponding 3D key point found is the point at the upper corner of the front door in the 3D model of the vehicle.
  • From the position (u, v) of each 2D key point in the image coordinate system, the position (x_m, y_m, z_m) of the corresponding 3D key point of the 3D model in the model coordinate system, and the pose of the model in the world coordinate system when the calibration image was taken, the position (x_w, y_w, z_w) of each 3D key point of the known target in the world coordinate system can be determined; the calibration result then follows from the projection relation s·(u, v, 1)^T = S·(x_w, y_w, z_w, 1)^T, where s is a constant scale factor.
  • the xoy plane of the world coordinate system may overlap with the road where the vehicle is located.
  • the origin of the world coordinate system may be a projection point corresponding to the center of the camera on the road where the vehicle is located.
  • S13 includes: determining a calibration result of a camera corresponding to each image in the target image group according to the plurality of surveillance image data.
  • the multiple surveillance image data include multiple models of vehicles, some of which have unknown 3D models of vehicles, and some of which have known 3D models of vehicles. Therefore, automatic calibration of the camera can be performed based on the vehicle whose 3D model is known.
  • S13 can be implemented as follows: take the camera corresponding to each image in the target image group, in turn, as the camera to be calibrated; find, among the multiple monitoring image data, the monitoring image data captured by the camera to be calibrated; from those data, determine multiple images of a target whose 3D model is known (for example, if the 3D model of a certain model of vehicle is known, the target with a known 3D model is a vehicle of that model); determine the coordinates of the 3D key points representing the target according to the known 3D model, and detect the coordinates of the 2D key points representing the target in the multiple images captured by the camera; then determine the calibration result of the camera to be calibrated from the three-dimensional coordinates of the target's 3D key points and the two-dimensional coordinates of the corresponding 2D key points.
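One standard way to realise this calibration step from matched 3D world points and 2D image points is the direct linear transform (DLT). The patent does not name a specific algorithm, so this is an illustrative sketch under that assumption:

```python
import numpy as np

def dlt_projection_matrix(points_3d, points_2d):
    """Estimate the 3x4 projection matrix S from >= 6 matched 3D world
    points and 2D image points via the direct linear transform.  Each
    correspondence contributes two linear equations in the 12 entries
    of S; the null-space vector of the stacked system (the singular
    vector of the smallest singular value) gives S up to scale."""
    A = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    return vt[-1].reshape(3, 4)
```

Given noiseless correspondences in general position, the recovered matrix reprojects the 3D points exactly onto their 2D observations (up to the overall scale, which cancels in the perspective divide).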
  • the corresponding relationship between the camera identification and the camera calibration result is stored.
  • Alternatively, S13 can be implemented as follows: obtain the identifier of the camera corresponding to each image in the target image group, and then, for each camera identifier, look up the corresponding camera calibration result in the pre-stored correspondence between camera identifiers and camera calibration results.
  • The three-dimensional model of the vehicle of the target model includes the three-dimensional key points representing the profile of the vehicle and the relative positional relationships between those key points (for example, a model coordinate system can be established with a certain three-dimensional key point as the origin and a plane parallel to the plane of the vehicle's chassis as a coordinate plane; the coordinate values of the three-dimensional key points in this model coordinate system then represent their relative positions). The types and number of three-dimensional key points contained in the three-dimensional models of different models of vehicle can be consistent, and the key points involved in each 3D model can include points on each light, each window, each door, each wheel, the front surface of the car, the rear surface of the car, the roof surface, and so on.
  • The model coordinate system can be a three-dimensional coordinate system, and the model coordinate system and the world coordinate system may differ only by a translation, without any rotation or scaling. The origin of the model coordinate system can be a point on the vehicle in an image in the target image group; the z-axis of the model coordinate system can be perpendicular to the road where the vehicle is located; the x-axis can be parallel to the central axis of the vehicle; and the y-axis is perpendicular to both the x-axis and the z-axis.
  • In the above solution, the images of the vehicle of the target model at different viewing angles are determined from the monitoring image data, and the three-dimensional model of the vehicle of the target model is then obtained from those images and the calibration results of the corresponding cameras. Compared with manually scanning a vehicle with a hand-held 3D scanning device, building the vehicle's 3D model from monitoring image data is more efficient; moreover, since neither 3D scanning equipment nor a physical vehicle is needed, the cost is lower and the method is easier to implement.
  • S14 includes steps: B1-B2.
  • the key point information group includes: the positions of multiple two-dimensional key points representing the outline of the vehicle in the corresponding image in the image.
  • B1 can be implemented as follows: for each image in the target image group, directly acquire the positions, determined in the aforementioned step S12, of the multiple two-dimensional key points that represent the outline of the vehicle in the image; the positions of these key points in the image constitute the key point information group of the image. It can be understood that some vehicle detection or vehicle model detection models perform vehicle key point detection at the same time as vehicle detection or vehicle model detection, so the positions of the multiple two-dimensional key points in the image have already been determined in step S12.
  • Alternatively, B1 can be implemented in the following manner: for each image in the target image group, use a pre-trained vehicle key point extraction model to extract the key points of the image, so as to determine, from the image, the positions of the multiple 2D key points representing the outline of the vehicle. It can be understood that some vehicle detection or vehicle model detection models do not perform vehicle key point detection together with vehicle detection or vehicle model detection, so an additional step is required to determine the positions of the multiple two-dimensional key points in the image.
  • B2: Obtain the 3D model of the vehicle of the target model according to the key point information group corresponding to each image in the target image group and the calibration result of the camera that captured each image.
  • the key point information group corresponding to each image in the target image group includes: the positions, in the corresponding image, of multiple two-dimensional key points representing the outline of the vehicle; then, according to the key point information group corresponding to each image in the target image group and the calibration result of the camera that captured each image, the 3D model of the vehicle of the target model can be obtained.
  • the vehicle models contained in the images can be automatically reconstructed in three dimensions from a large number of multi-angle images, improving both the efficiency and the accuracy of the three-dimensional reconstruction.
  • B2 includes steps B21-B23.
  • B21: Determine the initial three-dimensional model of the vehicle of the target model; the initial three-dimensional model includes: each three-dimensional key point constituting the three-dimensional model, and the initial coordinates of each three-dimensional key point in the model coordinate system.
  • the model coordinate system is a three-dimensional coordinate system, which can be established according to user requirements and is not limited here.
  • the 3D key points contained in the initial 3D models of vehicles of various models may be consistent, and the 3D key points involved in each 3D model may include: points on each lamp, each door, each window, each wheel, and on the periphery of the front surface, rear surface, roof surface, etc. of the vehicle.
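An initial 3D model of this kind can be sketched as a mapping from named key points to initial model-frame coordinates. All names and coordinate values below are hypothetical placeholders, not values from the patent:

```python
# Illustrative sketch only: a minimal initial 3D model as a mapping from
# named key points (lamps, wheels, roof corners, ...) to initial
# model-frame coordinates (x, y, z) in meters.
INITIAL_CAR_MODEL = {
    "front_left_lamp":   (0.75, 1.9, 0.65),
    "front_right_lamp":  (-0.75, 1.9, 0.65),
    "front_left_wheel":  (0.80, 1.3, 0.30),
    "front_right_wheel": (-0.80, 1.3, 0.30),
    "rear_left_wheel":   (0.80, -1.3, 0.30),
    "rear_right_wheel":  (-0.80, -1.3, 0.30),
    "roof_front_left":   (0.70, 0.6, 1.45),
    "roof_front_right":  (-0.70, 0.6, 1.45),
}
```

Because all initial models of a vehicle class share the same key point names, the optimization in the later steps can treat the coordinates as a fixed-size parameter vector.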
  • a 3D model of a vehicle of a known model can be used as the initial 3D model, or a unified initial 3D model can be specified for a certain class of vehicles.
  • the initial 3D models of vehicles belonging to the same class can be the same by default, for example, cars belong to the same class, trucks belong to the same class, and non-motor vehicles belong to the same class.
  • B22: For each image in the target image group, determine the initial pose, in the world coordinate system, of the vehicle in the image at the time the image was captured.
  • the initial pose can be determined according to the two-dimensional key points detected in each image combined with camera calibration information, or can be determined according to the pose of the same vehicle in the image captured at the associated time, and can also be a default value determined based on experience.
  • For example, if the two-dimensional key points detected in the first image are key points on the left rear wheel, left window, left front glass, left rear glass, etc., and it can be determined from the calibration results of the camera that captured the first image that the camera is erected parallel to the road, an initial pose can be estimated accordingly.
  • If the second image also includes vehicle A, the second image is captured by the same camera as the first image, and the shooting time interval is less than a certain length of time (for example, 3 s), then the pose of vehicle A in the first image can be used to estimate the pose of the vehicle in the second image; for example, the pose of vehicle A in the first image can be used as the initial pose of vehicle A in the second image.
  • Alternatively, the initial pose of the vehicle in the world coordinate system at the time each image was captured can be set to the same default value.
  • the method for optimizing the initial pose of the vehicle in each image and the initial coordinates of the 3D key points of the initial 3D model may be bundle adjustment.
  • B23 includes steps: B231-B233.
  • the initial projection point includes a point obtained by projecting, into the image coordinate system under the initial pose corresponding to the image, a 3D key point in the initial 3D model that corresponds to a 2D key point of the image.
  • the initial pose of the vehicle is the initial pose of the 3D model corresponding to the vehicle in the world coordinate system.
  • B231 can be implemented in the following manner: for each image, according to the two-dimensional key point information group corresponding to the image, determine from the initial three-dimensional model the initial coordinates of the three-dimensional key points corresponding to the two-dimensional key points in the image; for each determined three-dimensional key point, input the initial coordinates (x, y, z) of the three-dimensional key point in the model coordinate system, the camera calibration result S, and the initial pose T corresponding to the image into the projection expression, in which the point is transformed by the pose T and then projected through the calibration result S (i.e., the homogeneous relation s·(u', v', 1)ᵀ = S·T·(x, y, z, 1)ᵀ), obtaining the position (u', v') of the initial projection point corresponding to the 3D key point in the image coordinate system corresponding to the image.
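The projection step in B231 matches a standard pinhole projection: transform the model-frame point by the pose T, then apply the camera calibration S and a perspective divide. A numpy sketch under that assumption (the patent's exact expression is given as a formula image not reproduced here; the function name and matrix shapes are illustrative):

```python
import numpy as np

def project_point(p_model: np.ndarray, T_pose: np.ndarray, S_cam: np.ndarray) -> np.ndarray:
    """Project a 3D key point into the image.
    p_model: (3,)  initial coordinates (x, y, z) in the model frame.
    T_pose:  (4,4) homogeneous pose of the vehicle in the world frame.
    S_cam:   (3,4) camera calibration (intrinsics combined with extrinsics).
    Returns the pixel position (u', v')."""
    p_h = np.append(p_model, 1.0)   # homogeneous point (x, y, z, 1)
    p_world = T_pose @ p_h          # model frame -> world frame
    uvw = S_cam @ p_world           # world frame -> image plane (homogeneous)
    return uvw[:2] / uvw[2]         # perspective divide -> (u', v')
```

With an identity pose and a camera looking down its own z-axis, a point on the optical axis lands exactly on the principal point, which is a quick sanity check for the matrix shapes.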
  • B232: Determine the first loss value corresponding to the image according to the position difference between each initial projection point corresponding to the image and the corresponding two-dimensional key point.
  • B232 can be implemented in the following manner: for each initial projection point corresponding to the image, determine the distance between the position (u', v') of the initial projection point in the image coordinate system corresponding to the image and the position (u, v) of the two-dimensional key point corresponding to the three-dimensional key point of that initial projection point; the sum of the distances corresponding to the image is determined as the first loss value.
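The first loss value of step B232 is then just the sum of the per-point reprojection distances; as a sketch (the function name is illustrative):

```python
import numpy as np

def first_loss(projected: np.ndarray, detected: np.ndarray) -> float:
    """projected: (N, 2) positions (u', v') of the initial projection points;
    detected:  (N, 2) positions (u, v) of the matching 2D key points.
    Returns the sum of the Euclidean distances, i.e. the first loss value."""
    return float(np.linalg.norm(projected - detected, axis=1).sum())
```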
  • the optimized three-dimensional model is the three-dimensional model of the vehicle of the target model.
  • the initial coordinates of the 3D key points of the initial 3D model and the initial pose corresponding to each image can be optimized based on the sum of the first loss values corresponding to the images.
  • the preset condition can be one of the following: the sum of the loss values corresponding to the images converges; the sum of the loss values corresponding to the images is the minimum value among previous iterations; the loss value corresponding to each image is less than a target loss value; the sum of the loss values corresponding to the images is less than a preset value; or the number of iterations reaches a preset number.
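Several of these preset conditions can be combined into a single stopping test. A hedged sketch, where the default thresholds are placeholders rather than values from the patent:

```python
def should_stop(loss_history, target_loss=1.0, eps=1e-6, max_iters=100):
    """Combine a few of the preset stopping conditions: iteration budget
    exhausted, loss below a target value, or loss sum converged.
    loss_history: list of per-iteration loss sums, oldest first."""
    if len(loss_history) >= max_iters:
        return True  # number of iterations reached the preset number
    if loss_history and loss_history[-1] < target_loss:
        return True  # loss fell below the target loss value
    if len(loss_history) >= 2 and abs(loss_history[-1] - loss_history[-2]) < eps:
        return True  # loss sum has converged between iterations
    return False
```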
  • B233 can be implemented in the following manner.
  • According to the first loss value corresponding to each image, the initial coordinates of the 3D key points of the initial 3D model and the initial pose corresponding to each image are optimized to obtain an optimized 3D model and optimized poses; for each image in the target image group, according to the two-dimensional key point information group corresponding to the image, the coordinates of the three-dimensional key points corresponding to the two-dimensional key points in the image are determined from the optimized 3D model; for each such three-dimensional key point, its coordinates (x, y, z), the camera calibration result S corresponding to the image, and the optimized pose T are input into the projection expression to obtain the position (u', v') of the optimized projection point corresponding to the 3D key point in the image coordinate system corresponding to the image; a new loss value is then determined according to each optimized projection point corresponding to the image and the corresponding 2D key point.
  • In the above solution, the camera calibration result corresponding to each image and the initial pose determine the positions of the initial projection points corresponding to the image in the image coordinate system; then, according to the position difference between the initial projection points and the corresponding two-dimensional key points, the first loss value corresponding to the image is determined; and according to the first loss value of each image, the initial coordinates of the three-dimensional key points of the initial three-dimensional model and the initial poses are optimized until the new loss value determined using the optimized 3D model and the optimized poses meets the preset condition, at which point the optimization stops. The accuracy of the final 3D model can thereby be guaranteed.
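The project-measure-update loop described above can be illustrated with a deliberately simplified toy: instead of full bundle adjustment over poses and 3D key points under a perspective camera, this sketch refines only a 2D image-plane offset under an orthographic projection, stopping once the reprojection loss converges (one of the preset conditions). All names, constants, and the squared-error gradient step are illustrative, not the patent's implementation:

```python
import numpy as np

def refine_offset(keypoints3d: np.ndarray, detections: np.ndarray,
                  lr: float = 0.1, iters: int = 500, tol: float = 1e-8):
    """Toy stand-in for the B233 loop.
    keypoints3d: (N, 3) model key points; detections: (N, 2) 2D key points.
    Projection is orthographic: (u, v) = (x, y) + offset."""
    offset = np.zeros(2)
    prev_loss, loss = np.inf, np.inf
    for _ in range(iters):
        proj = keypoints3d[:, :2] + offset            # "project" the model
        residual = proj - detections                  # reprojection error
        loss = np.linalg.norm(residual, axis=1).sum() # loss value (sum of distances)
        if abs(prev_loss - loss) < tol:               # preset condition: converged
            break
        prev_loss = loss
        offset -= lr * residual.mean(axis=0)          # gradient-style update
    return offset, loss
```

In the real method the parameter vector additionally contains every 3D key point and every per-image pose, and the projection is the perspective expression from B231, but the project/compare/update/stop structure is the same.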
  • the method further includes: for each image group in the at least one image group, deduplicating the vehicle images belonging to the same viewing angle in the image group.
  • the preset threshold may be any value in 79%-90%.
  • the vehicle images belonging to the same viewing angle in the image group are deduplicated, so as to reduce the complexity of 3D reconstruction using the image group and improve the efficiency of 3D reconstruction.
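The visible text does not specify how "same viewing angle" duplicates are detected; one plausible, purely illustrative choice is to compare each vehicle image's bounding box against the images already kept and drop those whose overlap (IoU) exceeds the preset threshold:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def deduplicate(boxes, thresh=0.85):
    """Greedily keep a box only if its IoU with every kept box is below
    the preset threshold (0.85 here is an illustrative value from the
    range mentioned in the text)."""
    kept = []
    for b in boxes:
        if all(iou(b, k) <= thresh for k in kept):
            kept.append(b)
    return kept
```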
  • FIG. 2 is a structural block diagram of a three-dimensional reconstruction apparatus 200 provided by an embodiment of the present disclosure.
  • The structural block diagram shown in Figure 2 is described below; the apparatus shown includes:
  • the obtaining unit 210 may be configured to obtain a plurality of monitoring image data; the plurality of monitoring image data all include image data of a vehicle.
  • the image group determining unit 220 may be configured to determine the target image group of the target model vehicle according to the plurality of surveillance image data; the target image group includes: the target model vehicle under different viewing angles image.
  • the calibration result obtaining unit 230 may be configured to obtain a calibration result of the camera corresponding to each image in the target image group.
  • the 3D model obtaining unit 240 may be configured to obtain the 3D model of the vehicle of the target model according to the target image group and the calibration results of the cameras corresponding to the images in the target image group.
  • the image group determination unit 220 includes: a detection unit, configured to detect vehicles in the plurality of monitoring image data to obtain a plurality of vehicle images; a grouping unit, which may be configured to group the plurality of vehicle images according to the model of the vehicle to obtain at least one image group corresponding to at least one vehicle model, each image group including a plurality of vehicle images of the corresponding model under different viewing angles; and a selecting unit, which may be configured to determine one image group from the at least one image group as the target image group.
  • the device further includes: a deduplication unit, which may be configured to, for each image group in the at least one image group, deduplicate the vehicle images belonging to the same viewing angle in the image group.
  • the 3D model obtaining unit 240 includes: an information group obtaining unit, configured to obtain the key point information group corresponding to each image in the target image group, the key point information group including the positions, in the corresponding image, of multiple two-dimensional key points representing the outline of the vehicle; and a three-dimensional model obtaining subunit, which may be configured to obtain the three-dimensional model of the vehicle of the target model according to the key point information group corresponding to each image in the target image group and the calibration results of the cameras that captured the respective images.
  • the 3D model obtaining subunit includes: an initial model determining unit, configured to determine the initial 3D model of the vehicle of the target model, the initial 3D model including: each three-dimensional key point constituting the three-dimensional model, and the initial coordinates of each three-dimensional key point in the model coordinate system;
  • the initial pose determination unit may be configured to determine, for each image in the target image group, the initial pose in the world coordinate system of the vehicle in the image at the time the image was captured;
  • the optimization unit may be configured to optimize, based on the key point information group corresponding to each image in the target image group and the calibration result of the camera that captured each image, the initial pose of the vehicle in each image and the initial coordinates of the 3D key points of the initial 3D model, so as to obtain the 3D model of the vehicle of the target model.
  • the optimization unit includes: a projection unit, configured to, for each image in the target image group, according to the initial 3D model, the camera calibration result corresponding to the image, and the initial Pose, determine the position of the initial projection point corresponding to the image in the image coordinate system;
  • the initial projection point includes a point obtained by projecting, into the image coordinate system under the initial pose corresponding to the image, the three-dimensional key point in the initial three-dimensional model corresponding to a two-dimensional key point of the image;
  • the loss determination unit may be configured to determine the first loss value corresponding to the image according to the position difference between each initial projection point corresponding to the image and the corresponding two-dimensional key point;
  • the optimization subunit is used to optimize, according to the first loss value corresponding to each image, the initial coordinates of the three-dimensional key points of the initial three-dimensional model and the initial pose corresponding to each image, until the new loss value determined using the optimized three-dimensional model and the optimized pose satisfies the preset condition;
  • the optimized three-dimensional model is the three-dimensional model of the vehicle of the target model.
  • the calibration result acquisition unit 230 may be configured to determine, according to the plurality of monitoring image data, the calibration results of the cameras corresponding to the images in the target image group.
  • FIG. 3 is a schematic structural diagram of an electronic device 300 provided by an embodiment of the present disclosure.
  • the electronic device 300 may be a personal computer, a tablet computer, a smart phone, a personal digital assistant (PDA), or the like.
  • the electronic device 300 may include: a memory 302, a processor 301, a communication interface 303, and a communication bus, and the communication bus is used to implement connection and communication of these components.
  • the memory 302 is used to store various data such as the computer program instructions corresponding to the three-dimensional reconstruction method and device provided by the embodiments of the present disclosure; the memory 302 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), etc.
  • the processor 301 is used to read and run the computer program instructions corresponding to the three-dimensional reconstruction method and device stored in the memory, so as to: acquire a plurality of monitoring image data, where each of the plurality of monitoring image data includes image data of a vehicle; determine, according to the plurality of monitoring image data, the target image group of the target model vehicle, where the target image group includes images of the target model vehicle at different viewing angles; obtain the calibration result of the camera corresponding to each image in the target image group; and obtain a three-dimensional model of the vehicle of the target model according to the target image group and the calibration results of the cameras corresponding to the images in the target image group.
  • the processor 301 may be an integrated circuit chip with signal processing capability.
  • the processor 301 can be a general-purpose processor, including a CPU, a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the communication interface 303 is used for receiving or sending data.
  • an embodiment of the present disclosure also provides a storage medium, in which a computer program is stored; when the computer program is run on a computer, the computer is made to execute the method provided by any one of the embodiments of the present disclosure.
  • the 3D reconstruction method, device, electronic device, and storage medium proposed by the various embodiments of the present disclosure determine the images of the target model vehicle under different viewing angles based on a plurality of monitoring image data, and then obtain the 3D model of the vehicle of the target model according to each image and the calibration result of the corresponding camera.
  • This method does not need to manually scan the vehicle with a 3D scanning device, and the construction of the 3D model of the vehicle can be realized based on the monitoring image data, which is more efficient. Since there is no need for 3D scanning equipment and physical vehicles, the cost is lower and the implementation is easier.
  • each block in a flowchart or block diagram may represent a module, program segment, or part of code that includes one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • each functional module in each embodiment of the present disclosure may be integrated together to form an independent part, each module may exist independently, or two or more modules may be integrated to form an independent part.
  • the present disclosure provides a three-dimensional reconstruction method, device, electronic equipment, and storage medium.
  • The method includes: acquiring a plurality of monitoring image data, the plurality of monitoring image data including image data of a vehicle; determining, according to the plurality of monitoring image data, the target image group of the target model vehicle, the target image group including images of the target model vehicle under different viewing angles; obtaining the calibration results of the cameras corresponding to the images in the target image group; and obtaining the 3D model of the vehicle of the target model according to the target image group and the calibration result of the camera corresponding to each image in the target image group.
  • This method does not need to scan the vehicle with a hand-held 3D scanning device and can realize the construction of a 3D model of the vehicle based on the monitoring image data, which is more efficient; moreover, since no 3D scanning equipment or physical vehicle is required, the cost is lower and implementation is easier.
  • the three-dimensional reconstruction method, device, electronic device and storage medium of the present disclosure are reproducible and can be used in various industrial applications.
  • the three-dimensional reconstruction method, device, electronic equipment, and storage medium of the present disclosure can be used in the technical field of image processing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The present disclosure relates to a three-dimensional reconstruction method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a plurality of pieces of monitoring image data, each of which comprises image data of a vehicle; determining a target image group of a vehicle of a target model according to the plurality of pieces of monitoring image data, the target image group comprising images of the vehicle of the target model at different viewing angles; acquiring a calibration result of the camera corresponding to each image in the target image group; and obtaining a three-dimensional model of the vehicle of the target model according to the target image group and the calibration result of the camera corresponding to each image in the target image group. By means of the method, it is not necessary to manually hold a three-dimensional scanning device to scan a vehicle, and a three-dimensional vehicle model can be constructed on the basis of monitoring image data, so that the efficiency is higher; in addition, there is no need for the three-dimensional scanning device or a physical vehicle, so that the cost is lower and implementation is easier.
PCT/CN2022/098993 2021-08-13 2022-06-15 Procédé et appareil de reconstruction tridimensionnelle et dispositif électronique et support de stockage WO2023016082A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110931999.1A CN113793413A (zh) 2021-08-13 2021-08-13 三维重建方法、装置、电子设备及存储介质
CN202110931999.1 2021-08-13

Publications (1)

Publication Number Publication Date
WO2023016082A1 true WO2023016082A1 (fr) 2023-02-16

Family

ID=79181817

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/098993 WO2023016082A1 (fr) 2021-08-13 2022-06-15 Procédé et appareil de reconstruction tridimensionnelle et dispositif électronique et support de stockage

Country Status (2)

Country Link
CN (1) CN113793413A (fr)
WO (1) WO2023016082A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793413A (zh) * 2021-08-13 2021-12-14 北京迈格威科技有限公司 三维重建方法、装置、电子设备及存储介质
CN114299230B (zh) * 2021-12-21 2024-09-10 中汽创智科技有限公司 一种数据生成方法、装置、电子设备及存储介质
CN114821497A (zh) * 2022-02-24 2022-07-29 广州文远知行科技有限公司 目标物位置的确定方法、装置、设备及存储介质
CN115620094B (zh) * 2022-12-19 2023-03-21 南昌虚拟现实研究院股份有限公司 关键点的标注方法、装置、电子设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190141310A1 (en) * 2018-12-28 2019-05-09 Intel Corporation Real-time, three-dimensional vehicle display
CN109816704A (zh) * 2019-01-28 2019-05-28 北京百度网讯科技有限公司 物体的三维信息获取方法和装置
CN112652056A (zh) * 2020-12-25 2021-04-13 北京奇艺世纪科技有限公司 一种3d信息展示方法及装置
CN112902874A (zh) * 2021-01-19 2021-06-04 中国汽车工程研究院股份有限公司 图像采集装置及方法、图像处理方法及装置、图像处理系统
CN113793413A (zh) * 2021-08-13 2021-12-14 北京迈格威科技有限公司 三维重建方法、装置、电子设备及存储介质

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996220A (zh) * 2014-05-26 2014-08-20 江苏大学 一种智能交通中的三维重建方法及系统
CN104346833A (zh) * 2014-10-28 2015-02-11 燕山大学 一种基于单目视觉的车辆重构算法
CN110517349A (zh) * 2019-07-26 2019-11-29 电子科技大学 一种基于单目视觉和几何约束的3d车辆目标检测方法
CN111145238B (zh) * 2019-12-12 2023-09-22 中国科学院深圳先进技术研究院 单目内窥镜图像的三维重建方法、装置及终端设备
CN111476798B (zh) * 2020-03-20 2023-05-16 上海遨遥人工智能科技有限公司 一种基于轮廓约束的车辆空间形态识别方法及系统
CN111462249B (zh) * 2020-04-02 2023-04-18 北京迈格威科技有限公司 一种交通摄像头标定方法及装置
CN112489126B (zh) * 2020-12-10 2023-09-19 浙江商汤科技开发有限公司 车辆关键点信息检测方法、车辆控制方法及装置、车辆
CN112541460B (zh) * 2020-12-21 2022-05-13 山东师范大学 一种车辆再识别方法及系统
CN112750203B (zh) * 2021-01-21 2023-10-31 脸萌有限公司 模型重建方法、装置、设备及存储介质
CN112767489B (zh) * 2021-01-29 2024-05-14 北京达佳互联信息技术有限公司 一种三维位姿确定方法、装置、电子设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190141310A1 (en) * 2018-12-28 2019-05-09 Intel Corporation Real-time, three-dimensional vehicle display
CN109816704A (zh) * 2019-01-28 2019-05-28 北京百度网讯科技有限公司 物体的三维信息获取方法和装置
CN112652056A (zh) * 2020-12-25 2021-04-13 北京奇艺世纪科技有限公司 一种3d信息展示方法及装置
CN112902874A (zh) * 2021-01-19 2021-06-04 中国汽车工程研究院股份有限公司 图像采集装置及方法、图像处理方法及装置、图像处理系统
CN113793413A (zh) * 2021-08-13 2021-12-14 北京迈格威科技有限公司 三维重建方法、装置、电子设备及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TANG XINYAO, SONG HUANSHENG, WANG WEI, ZHANG CHAOYANG, CUI HUA: "3D Vehicle Information Recognition Algorithm of Monocular Camera Based onSelf-Calibration in Traffic Scene", JOURNAL OF COMPUTER-AIDED DESIGN & COMPUTER GRAPHICS, vol. 32, no. 8, 1 August 2020 (2020-08-01), CN , pages 1305 - 1314, XP093034238, ISSN: 1003-9775, DOI: 10.3724/SP.J.1089.2020.18041 *

Also Published As

Publication number Publication date
CN113793413A (zh) 2021-12-14

Similar Documents

Publication Publication Date Title
WO2023016082A1 (fr) Procédé et appareil de reconstruction tridimensionnelle et dispositif électronique et support de stockage
Taneja et al. Image based detection of geometric changes in urban environments
US9483703B2 (en) Online coupled camera pose estimation and dense reconstruction from video
EP3502621B1 (fr) Localisation visuelle
CN110176032B (zh) 一种三维重建方法及装置
WO2021143935A1 (fr) Procédé de détection, dispositif, appareil électronique et support de stockage
US9091553B2 (en) Systems and methods for matching scenes using mutual relations between features
US10872246B2 (en) Vehicle lane detection system
WO2018120027A1 (fr) Procédé et appareil de détection d'obstacles
CN110879994A (zh) 基于形状注意力机制的三维目测检测方法、系统、装置
CN111837158A (zh) 图像处理方法、装置、拍摄装置和可移动平台
WO2020258297A1 (fr) Procédé de segmentation sémantique d'image, plateforme mobile et support de stockage
CN113240734B (zh) 一种基于鸟瞰图的车辆跨位判断方法、装置、设备及介质
WO2024016524A1 (fr) Procédé et appareil d'estimation de position de véhicule connecté basés sur un échantillonnage incrémentiel non uniforme indépendant
CN111928842B (zh) 一种基于单目视觉实现slam定位的方法及相关装置
WO2023284358A1 (fr) Procédé et appareil d'étalonnage de caméra, dispositif électronique et support de stockage
WO2023016182A1 (fr) Procédé et appareil de détermination de pose, dispositif électronique et support d'enregistrement lisible
CN114898321B (zh) 道路可行驶区域检测方法、装置、设备、介质及系统
CN116051736A (zh) 一种三维重建方法、装置、边缘设备和存储介质
CN110673607A (zh) 动态场景下的特征点提取方法、装置、及终端设备
Kume et al. Bundle adjustment using aerial images with two-stage geometric verification
CN116823966A (zh) 相机的内参标定方法、装置、计算机设备和存储介质
CN112598736A (zh) 一种基于地图构建的视觉定位方法及装置
CN116721109B (zh) 一种双目视觉图像半全局匹配方法
TWI855330B (zh) 三維目標檢測方法、電子設備及計算機可讀存儲媒體

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22855067

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13/06/2024)