WO2023016182A1 - Pose determination method and apparatus, electronic device, and readable storage medium - Google Patents

Info

Publication number
WO2023016182A1
WO2023016182A1 · PCT/CN2022/105549 · CN2022105549W
Authority
WO
WIPO (PCT)
Prior art keywords
target
error value
model
information
initial
Application number
PCT/CN2022/105549
Other languages
French (fr)
Chinese (zh)
Inventor
Wang Bin (王彬)
Original Assignee
Beijing Megvii Technology Co., Ltd. (北京迈格威科技有限公司)
Application filed by Beijing Megvii Technology Co., Ltd. (北京迈格威科技有限公司)
Publication of WO2023016182A1 (en)

Classifications

    • G06T3/08
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Definitions

  • the present application relates to the field of information processing, and in particular to a pose determination method and apparatus, an electronic device, and a readable storage medium.
  • the pose parameters of the target object can often be estimated in the following two ways.
  • One is to first render images of the target object in different poses from its 3D model, then use the rendered images as the input of a convolutional neural network and the pose parameters corresponding to the different poses as its expected output, so as to train the network. After the convolutional neural network converges, it can be used for pose estimation.
  • the image to be processed can then be used as the input of the network, and the output is the corresponding pose parameters. Since this image-recognition-based method is not constrained by a strict projection-relationship equation, it is difficult to obtain accurate estimates of the target pose parameters, and its generalization is poor.
  • The second is to use a deep learning method to predict the two-dimensional key points of the target object, and then estimate the pose from the projection relationship between the model points of the three-dimensional model and the two-dimensional key points of the target object in the image, combining the three-dimensional model of the target object with the camera calibration information.
  • This method can help improve the accuracy and robustness of the estimation results, but it requires a pre-generated 3D model of the target object and an accurate model matching algorithm, which makes the estimation process considerably more difficult.
  • the purpose of the embodiments of the present application is to provide a pose determination method and apparatus, an electronic device, and a readable storage medium that obtain the target model and the target pose simultaneously, thereby solving two problems in the pose estimation process of the target object: the target model cannot be pre-generated, and matching the target object to the target model is difficult.
  • An embodiment of the present application provides a pose determination method, which may include: acquiring a set of images to be processed, the set including a plurality of target images corresponding to a target object; for each target image, determining the initial pose information of the target object based on the target key point information of that image; determining an initial model of the target object; and, based on the initial model and the initial pose information corresponding to each of the plurality of target images, determining a target model of the target object and a target pose of the target object.
  • the target model and the target pose can be determined at the same time, so that the object model does not need to be generated in advance and no matching algorithm between the target object and the object model needs to be designed, which simplifies the processing steps of the pose estimation process.
  • the pose determination method may further include: acquiring prior information that matches the target object, the prior information being used to characterize the structure information and/or size information of the target object. Determining the target model and the target pose based on the initial model and the initial pose information corresponding to each of the plurality of target images then includes: determining the target model of the target object and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images.
  • the structure and/or size of the target object can be constrained by adding prior information to improve the optimization accuracy and make the obtained target model and target pose more accurate.
  • determining the target model of the target object and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images may include: calculating a minimum error value based on the prior information, the initial model, and the plurality of initial pose information, wherein the minimum error value is characterized by an object specification error value and a reprojection error value;
  • the object specification error value represents the error between the structure information and/or size information of the obtained object model and the prior information;
  • the reprojection error value represents the reprojection error between the projected points of the obtained object model in the corresponding target image and the corresponding key points; the object model obtained when the minimum error value is reached is determined as the target model, and the pose obtained at that point is determined as the target pose.
  • an implementation manner is provided that can simultaneously determine the target model and the target pose.
  • the minimum error value can be obtained based on the following steps: when it is detected that the error value is smaller than the error threshold, that error value is determined as the minimum error value; or, when the number of iterations reaches the upper limit, the smallest of the accumulated error values is determined as the minimum error value; wherein the iteration-number threshold corresponding to the upper limit matches the number of target images included in the image set to be processed.
  • two implementation manners for determining the minimum error value are provided, and one can be used in an actual application process.
  • calculating the minimum error value based on the prior information, the initial model, and the plurality of initial pose information may include: for each iterative calculation, calculating the current object specification error value based on the prior information and the current model of the target object obtained by the current optimization; calculating the current reprojection error value based on the current model and the currently corresponding initial pose information; and determining the minimum error value based on the current object specification error value and the current reprojection error value.
  • the minimum error value can be determined based on the object specification error value and the reprojection error value, so that the determined target model is closer to the target object and the determined target pose is closer to the actual pose of the target object at the moment the target image was captured.
  • determining the minimum error value based on the current object specification error value and the current reprojection error value may include: determining the product or the sum of the current object specification error value and the current reprojection error value as the error value calculated by the current iteration; and determining the minimum error value based on the error value calculated by the current iteration and the error values calculated by previous iterations.
  • a way is provided in which the minimum error value can be determined.
  • the target key point information includes position information of the target key point; and, for each target image, determining the initial pose information of the target object based on the target key point information of that image may include: determining the initial position information of the target key point in the world coordinate system based on the position information of the target key point and the camera calibration information, where the world coordinate system is a coordinate system that takes the motion plane of the target object as a coordinate plane; determining initial yaw angle information matching the initial position information; and determining the initial pose information of the target object based on the initial position information and the initial yaw angle information. This helps determine more accurate target pose information.
  • determining the initial yaw angle information matching the initial position information may include: within the target angle range, selecting a plurality of angles to combine with the initial position information, and determining the reprojection error values between the target key points and the corresponding model key points at each angle; the angle corresponding to the smallest reprojection error value is determined as the initial yaw angle information. This makes the initial yaw angle information closer to the target yaw angle information.
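The angle-sampling step above can be sketched as a plain grid search over candidate yaw angles. This is an illustrative sketch only, not the application's implementation: `project` is a hypothetical caller-supplied routine that projects a model key point into the image at a given yaw (using the camera calibration and the fixed initial position), and the grid resolution is an arbitrary choice.

```python
import math

def pick_initial_yaw(project, model_points, keypoints, num_angles=36):
    """Grid-search the yaw: project the model key points at each candidate
    angle and keep the angle with the smallest total reprojection error."""
    best_yaw, best_err = 0.0, float("inf")
    for i in range(num_angles):
        yaw = 2.0 * math.pi * i / num_angles  # sample (0, 2*pi) uniformly
        err = sum(
            (pu - ku) ** 2 + (pv - kv) ** 2
            for (pu, pv), (ku, kv) in zip(
                [project(p, yaw) for p in model_points], keypoints
            )
        )
        if err < best_err:
            best_yaw, best_err = yaw, err
    return best_yaw
```

A finer grid (or a subsequent local refinement) trades extra computation for an initial yaw closer to the target yaw.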
  • the initial model may be pre-determined based on the following steps: for each object model in the object model library, determining the projection points in the target image of the model key points that represent the object model; determining the reprojection error values between the projection points and the corresponding key points of the target image; and determining the object model corresponding to the smallest reprojection error value as the initial model.
  • the initial model can be closer to the target model, and the rate of determining the target model can be accelerated.
  • the set of images to be processed may include a plurality of target images determined from the moving path of the target object.
  • initial pose information relatively close to the actual pose can be obtained through multiple target images that represent the motion trajectory of the target object.
  • the embodiment of the present application also provides a pose determination apparatus, which may include: an acquisition module configured to acquire an image set to be processed, where the image set includes a plurality of target images corresponding to a target object; a first determining module configured to determine, for each target image, the initial pose information of the target object based on the target key point information of that image; a second determining module configured to determine an initial model of the target object; and a third determining module configured to determine the target model of the target object and the target pose of the target object based on the initial model and the initial pose information corresponding to each of the multiple target images.
  • the embodiment of the present application also provides an electronic device. The electronic device includes a processor and a memory, the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps in the method provided above are performed.
  • the embodiment of the present application also provides a readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the method provided above are performed.
  • the embodiment of the present application also provides a computer program product. The computer program product includes a computer program, the computer program includes program instructions, and when the program instructions are executed by a computer, the computer can perform the steps in the method described above.
  • FIG. 1 is a flow chart of a pose determination method provided in an embodiment of the present application
  • FIG. 2 is a flow chart of another pose determination method provided in the embodiment of the present application.
  • Fig. 3 is a structural block diagram of a pose determination device provided by an embodiment of the present application.
  • Fig. 4 is a schematic structural diagram of an electronic device for performing a pose determination method provided by an embodiment of the present application.
  • Artificial Intelligence is an emerging science and technology that studies and develops theories, methods, technologies and application systems for simulating and extending human intelligence.
  • artificial intelligence is a comprehensive discipline that involves many technologies such as chips, big data, cloud computing, the Internet of Things, distributed storage, deep learning, machine learning, and neural networks.
  • computer vision is specifically to allow machines to recognize the world.
  • Computer vision technology usually includes face recognition, liveness detection, fingerprint recognition and anti-counterfeiting verification, biometric recognition, face detection, pedestrian detection, target detection, etc.
  • the present application provides a pose determination method and apparatus, an electronic device, and a readable storage medium. By first obtaining an initial model that differs from the target model, and then simultaneously determining the target model and the target pose of the target object based on the initial model and the initial pose information of the target object shown in multiple target images, the above-mentioned problems are solved without pre-generating the target model and without designing a matching algorithm between the target object and the target model. In practical applications, this approach can be applied to the pose estimation of vehicles, drones, and similar objects.
  • the present application takes the pose estimation process applied to a vehicle as an example to illustrate the pose determination method.
  • the aforementioned target object may include a target vehicle, in which case the accuracy of the target pose of the target vehicle needs to be ensured. For example, in the field of intelligent traffic monitoring, tasks such as counting traffic flow and judging whether a driver is driving illegally can be performed based on the determined target pose.
  • FIG. 1 shows a flowchart of a first method for determining a pose provided by an embodiment of the present application.
  • the pose determination method may include the following steps 101 to 104 .
  • Step 101 acquiring a set of images to be processed;
  • the set of images to be processed includes a plurality of target images corresponding to a target object;
  • the aforementioned target image may include, for example, images corresponding to target objects such as a target truck, a target van, and a target car.
  • these images may be sorted to obtain the above image set to be processed.
  • the above-mentioned target image can be obtained through network resources, captured on the spot with a camera, or extracted from a pre-recorded video.
  • when the target image is acquired from a video recording, a frame containing the target object may be extracted and regarded as the target image.
  • the set of images to be processed may include multiple target images determined from a moving path of the target object.
  • multiple images may be captured in the moving path of the target object, and the captured images may be regarded as target images.
  • the multiple captured target images can be used to characterize the movement trajectory of the target object.
  • initial pose information relatively close to the actual pose can be obtained through multiple target images that represent the motion trajectory of the target object.
  • Step 102 for each target image, determine the initial pose information of the target object based on the target key point information of the target image;
  • the initial pose information of the target object can be determined based on the target key point information of each target image in the image set to be processed.
  • the above target key point information can be regarded as information of key points in the target image that can be used to characterize the position of the target object.
  • the key points may include, for example, points in the image corresponding to parts of the target vehicle, such as the front vehicle logo, the left-view mirror, and the right-view mirror.
  • one or more of the above-mentioned key points, such as the front vehicle logo and the left-view and right-view mirrors, can be selected as target key points, and the image coordinate information corresponding to the target key points can be regarded as the target key point information.
  • the image coordinate information here may be (u, v), for example.
  • the coordinate parameters "u" and "v” here can be any value in the image coordinate system.
  • the above initial pose information can be regarded as being able to roughly represent the position information and attitude information of the target object in the actual application scene.
  • the initial pose information may be represented, for example, by the coordinate information (X, Y, θ) of the target object in the world coordinate system.
  • the above-mentioned coordinate parameters "X", "Y", and "θ" may be any corresponding values in the world coordinate system.
  • "θ" can be regarded as the orientation (yaw) information of the target object.
  • Step 103 determining the initial model of the target object
  • the aforementioned initial model may be, for example, a predetermined model similar to the target object.
  • the size of the initial model may differ considerably from the size of the target model, so the target model does not need to be generated in advance.
  • the target object is a car
  • the initial model can be, for example, a van model.
  • a vehicle model library may be prepared in advance, and the vehicle model library may, for example, be designed with reference to various vehicle types in actual applications.
  • the initial model is pre-determined based on the following steps:
  • Step A for each object model in the object model library, determine the projection point of the model key point representing the object model in the target image;
  • the above-mentioned object model library may be, for example, the above-mentioned vehicle model library.
  • the aforementioned key points of the model may include, for example, key points of the vehicle model that can substantially represent the object model, such as the logo of the vehicle model and the left-view mirror.
  • the projection point of the key point of the front vehicle logo representing the vehicle model in the target vehicle image may be determined.
  • the above projection point can be obtained, for example, by a projection formula.
  • the image coordinates corresponding to the projection point can be calculated as (u, v) using the projection formula.
  • the above-mentioned u and v can be any values in the coordinate system to which they belong, and x w , y w , and z w can respectively represent the length, width, and height information of the target vehicle model.
  • the above-mentioned X and Y can be any values in the coordinate system to which they belong, and the above-mentioned θ can be any value within (0, 2π).
  • the projection formula above may include, for example: λ·(u, v, 1)ᵀ = K·P·(x w , y w , z w , 1)ᵀ, where λ is the scale factor, K is the camera intrinsic parameter matrix, and P is the camera extrinsic parameter matrix.
  • Step B determining the reprojection error value between the projection point and the corresponding key point of the target image
  • the reprojection error value between the projection point and the corresponding key point in the target image can be calculated. For example, after it is determined to use the vehicle front logo as the vehicle model key point A, the projection point A' of the model key point A in the target image can be obtained based on the above projection formula. The reprojection error value between projected point A' and its corresponding keypoint a can then be determined. In some application scenarios, for example, the above reprojection error value may be calculated by the least square method.
  • the manner of calculating the reprojection error value by using the least square method is a related technology, which will not be described in detail here.
  • Step C determining the object model corresponding to the smallest reprojection error value as the initial model.
  • the magnitudes of each reprojection error value can be compared, and then the smallest reprojection error value can be determined.
  • the object model corresponding to the minimum reprojection error value may be determined as the above initial model. For example, when it is determined that the reprojection error value corresponding to vehicle model A is the smallest, vehicle model A may be determined as the initial model.
  • the initial model can be determined through the calculated reprojection error value. This makes the initial model closer to the target model and speeds up the determination of the target model.
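Steps A to C above can be sketched as a search over the model library for the smallest reprojection error. This is a schematic only: the hypothetical `project` callback stands in for the projection-formula computation, and the model representation is invented for illustration.

```python
def choose_initial_model(model_library, keypoints, project):
    """For each candidate object model, project its key points into the
    target image and keep the model with the smallest total reprojection
    error. `project(model)` returns the model's projected key points as
    (u, v) pairs, in the same order as `keypoints`."""
    def reprojection_error(model):
        return sum(
            (pu - ku) ** 2 + (pv - kv) ** 2
            for (pu, pv), (ku, kv) in zip(project(model), keypoints)
        )
    return min(model_library, key=reprojection_error)
```

For a vehicle model library, `model_library` would hold the candidate vehicle models and `project` would apply the projection formula above to each model's key points.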
  • Step 104 Determine a target model of the target object and a target pose of the target object based on the initial model and the initial pose information corresponding to each of the plurality of target images.
  • the least square method can be used for continuous iterative optimization to obtain the above-mentioned target model and target pose.
  • the above target model may be applied to measure length information, height information, and/or width information of the vehicle, for example.
  • the target model and the target pose can be determined at the same time, so that there is no need to pre-generate the object model and design a matching algorithm between the target object and the object model, and optimize the processing steps of the pose estimation process.
  • FIG. 2 shows a flowchart of another pose determination method provided by an embodiment of the present application.
  • the pose determination method may include the following steps 201 to 205 .
  • Step 201 acquiring a set of images to be processed;
  • the set of images to be processed includes a plurality of target images corresponding to a target object;
  • The implementation process of step 201 and the technical effect obtained may be the same as or similar to those of step 101 of the embodiment shown in FIG. 1 , and details are not repeated here.
  • Step 202 for each target image, based on the target key point information of the target image, determine the initial pose information of the target object;
  • The implementation process of step 202 and the technical effect obtained may be the same as or similar to those of step 102 of the embodiment shown in FIG. 1 , and details are not repeated here.
  • Step 203 determining the initial model of the target object
  • The implementation process of step 203 and the technical effect obtained may be the same as or similar to those of step 103 of the embodiment shown in FIG. 1 , and details are not repeated here.
  • Step 204 acquiring prior information matching the target object; the prior information is used to characterize the structure information and/or size information of the target object;
  • prior information of the target object can be obtained.
  • the above-mentioned prior information may include, for example, size information of the target object such as length, width, and height, as well as structure information such as the coplanarity of the license plate and the front vehicle logo and the symmetry between the left-view mirror and the right-view mirror.
  • Step 205 Determine the target model of the target object and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images.
  • the target model and target pose can be determined simultaneously.
  • the structure and/or size of the target object can be constrained by adding prior information to improve the optimization accuracy and make the obtained target model and target pose more accurate.
  • the target model and target pose can be determined simultaneously through the following sub-steps:
  • Sub-step 2051 calculate a minimum error value based on the prior information, the initial model, and a plurality of the initial pose information; wherein, the minimum error value is characterized by an object specification error value and a reprojection error value;
  • the object specification error value represents an error between the structure information and/or size information corresponding to the obtained object model and the prior information;
  • the reprojection error value represents the reprojection error between the projected points of the obtained object model in the corresponding target image and the corresponding key points;
  • the minimum error value can be calculated using a non-linear least squares method.
  • the minimum error value can be characterized by object specification error value and reprojection error value.
  • the minimum product obtained by multiplying the object specification error value and the reprojection error value can be used as the minimum error value; or the minimum sum obtained by adding the object specification error value and the reprojection error value can be used as the minimum error value.
  • the sum obtained by adding the object specification error value and the reprojection error value can be the algebraic sum of the two, or the weighted sum of the two, which can be selected according to the actual situation, and is not limited here.
  • the aforementioned weighted sum may, for example, be realized by assigning different weight values to the two.
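The choices just described, product, plain sum, or weighted sum of the two error terms, can be sketched in one small function. This is a minimal illustration; the weight values are free parameters chosen per application, not values given in the text.

```python
def combined_error(spec_err, reproj_err, mode="weighted_sum",
                   w_spec=1.0, w_reproj=1.0):
    """Combine the object specification error and the reprojection error.
    The text allows their product or a (possibly weighted) sum."""
    if mode == "product":
        return spec_err * reproj_err
    if mode == "sum":
        return spec_err + reproj_err
    if mode == "weighted_sum":
        return w_spec * spec_err + w_reproj * reproj_err
    raise ValueError(f"unknown mode: {mode}")
```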
  • the above substep 2051 may include the following steps:
  • Step 1 for each iterative calculation, based on the prior information and the current model of the target object obtained by the current optimization, calculate the current object specification error value;
  • the specification error value of the current object can be determined based on the prior information and the currently obtained current model of the target object.
  • the above current object specification error value can be regarded as representing the error between the currently obtained object model and the prior information.
  • the prior information of the target vehicle is: the height is 3 meters, the width is 2 meters, and the length is 5 meters.
  • the obtained object model represents the target vehicle: the height is 3 meters, the width is 1.8 meters, and the length is 5 meters.
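Using the worked numbers above, one plausible form of the specification error is a sum of squared deviations between the model's dimensions and the prior (the application does not fix the exact metric; this choice is an assumption for illustration):

```python
def spec_error(model_dims, prior_dims):
    """Object specification error: sum of squared deviations between the
    current model's (height, width, length) and the prior information."""
    return sum((m - p) ** 2 for m, p in zip(model_dims, prior_dims))

# Prior: height 3 m, width 2 m, length 5 m.
# Current model: height 3 m, width 1.8 m, length 5 m.
err = spec_error((3.0, 1.8, 5.0), (3.0, 2.0, 5.0))  # only the width deviates: 0.2**2 = 0.04
```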
  • Step 2 based on the current model and the current corresponding initial pose information, calculate the current reprojection error value
  • the current reprojection error value can be determined based on the currently obtained object model and the currently corresponding initial pose information. For example, the target image corresponding to the currently corresponding initial pose information can be determined, the projection points of the model key points that characterize the currently obtained object model can then be determined, and the reprojection error value between the projection points and the corresponding key points of the target image can be calculated. For the calculation, reference can be made to the relevant parts of step B in the foregoing embodiments; details are not repeated here.
  • Step 3 Determine the minimum error value based on the current object specification error value and the current reprojection error value.
  • the above minimum error value can be determined. For example, different weights may be assigned to the current specification error value and the current reprojection error value to determine the aforementioned minimum error value.
  • this embodiment introduces the object specification error value through the above sub-steps 1 to 3, so that the minimum error value can be determined based on both the object specification error value and the reprojection error value. As a result, the determined target model is closer to the target object, and the determined target pose is closer to the actual pose of the target object at the moment the target image was captured.
  • the above step three may include: first, determining the product or the sum of the current object specification error value and the current reprojection error value as the error value calculated by the current iteration;
  • the current specification error value and the current reprojection error value can be accumulated to obtain the sum of the two, and then the error value corresponding to the current iteration can be obtained.
  • the current specification error value can also be multiplied by the current reprojection error value to obtain the product of the two, and then the error value corresponding to the current iteration can be obtained.
  • the minimum error value is determined based on the error value calculated by the current iteration and the error value calculated by historical iterations.
  • Each iterative calculation can correspond to an error value. After obtaining the error value corresponding to the current iteration, the error value can be compared with the error value obtained by historical iterative calculations to determine the current corresponding minimum error value.
  • the object model optimized when the minimum error value is obtained is determined as the target model, and the pose optimized when the minimum error value is obtained is determined as the target pose.
  • the object model corresponding to the minimum error value can be determined as the target model, and the corresponding pose at this time can be determined as the target pose.
  • the minimum error value is obtained based on the following steps: when it is detected that an error value is less than an error threshold, that error value is determined as the minimum error value; or, when the number of iterations reaches the iteration upper limit, the smallest of the plurality of error values is determined as the minimum error value; wherein the iteration-count threshold corresponding to the iteration upper limit matches the number of target images included in the image set to be processed.
  • the iterative calculation may be stopped when it is detected that the error value is smaller than the error threshold.
  • the object model optimized at this time may be determined as the target model, and the pose information obtained at this time may be determined as the target pose information. In this way, the computational cost of the iterative calculation can be reduced while the resulting target model basically matches the target object and the target pose basically matches the actual pose of the target object.
  • the error threshold may be, for example, 0.08 or 0.1, i.e., a value small enough that the object model corresponding to the error value can reasonably be regarded as the target model.
  • the iterative calculation can also be stopped when the maximum number of iterations is reached.
  • once the initial pose information corresponding to every target image has been used and the iterative calculation can no longer continue, the maximum number of iterations can be regarded as reached.
  • the object model obtained at this time can be determined as the target model, and the currently obtained pose information can be determined as the target pose.
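The two stopping criteria above can be sketched together in a single loop. This is an illustrative assumption, not the patent's code: `step` is a hypothetical callable returning `(model, pose, error)` for one optimization iteration, and the cap is tied to the number of target images via an assumed `iters_per_image` factor.

```python
# Hedged sketch of the stopping rule: stop early once the error drops below
# a threshold, or stop when the iteration cap (matched to the number of
# target images) is reached; return the model/pose with the minimum error.

def optimize(step, num_target_images, error_threshold=0.1, iters_per_image=10):
    max_iters = iters_per_image * num_target_images  # cap matches image count
    best_model, best_pose, best_err = None, None, float("inf")
    for _ in range(max_iters):
        model, pose, err = step()
        if err < best_err:                  # keep the running minimum
            best_model, best_pose, best_err = model, pose, err
        if err < error_threshold:           # early stop: error small enough
            break
    return best_model, best_pose, best_err
```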
  • the target key point information includes position information of the target key point; and step 102 in the above embodiment shown in FIG. 1 or step 202 in the embodiment shown in FIG. 2 may include the following sub-steps:
  • Sub-step 1: based on the position information of the target key point and the camera calibration information, determine the initial position information of the target key point in the world coordinate system;
  • the world coordinate system is a coordinate system that takes the motion plane of the target object as a coordinate plane;
  • the above camera calibration information may include an internal reference matrix and an external reference matrix of the camera, so as to correct the captured image and obtain a target image with less distortion.
  • the above-mentioned world coordinate system may include a coordinate system using the motion plane of the target object as a coordinate plane.
  • for example, the road-surface coordinate system that takes the road surface on which the target vehicle moves as its horizontal coordinate plane can be regarded as the world coordinate system.
  • the initial position information of the target key point in the world coordinate system can be determined.
  • the projection formula can be used to determine the initial position information (X, Y, Z) of the target key point in the world coordinate system.
  • the initial position information may be (X, Y, 0).
  • Sub-step 2: determine the initial yaw angle information matching the initial position information;
  • the initial yaw angle information matching the initial position information can then be determined.
  • the above initial yaw angle information can be determined based on the following steps: first, within the target angle range, select multiple angles to match the initial position information, and determine the reprojection error value of the model key point corresponding to the target key point at each angle;
  • the aforementioned target angle range may be (0, 2π), for example.
  • multiple angles can be selected within the target angle range to match the initial position information respectively. For example, 90°, 180°, 270°, 360°, etc. may be selected at equal intervals. After selecting 90° as the angle to match the initial position information, the pose information of the above model key point becomes (X, Y, 90°); the reprojection error between the projection point corresponding to the model key point and the target key point can then be calculated. Similarly, the reprojection errors corresponding to multiple model key points at their respective angles can be calculated.
  • the angle information corresponding to the smallest reprojection error value is determined as the initial yaw angle information.
  • the angle information corresponding to the smallest reprojection error value may be determined as the initial yaw angle information, so that the initial yaw angle information is closer to the target yaw angle information. For example, when the reprojection error value corresponding to the model key point whose pose information is (X, Y, 90°) is the smallest, 90° may be determined as the initial yaw angle information.
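The coarse yaw search above can be sketched as follows, under the assumption of a caller-supplied `reprojection_error(pose)` function (a hypothetical helper, not defined by the patent); poses are (X, Y, yaw) triples and candidate yaws are sampled at equal intervals over [0, 2π).

```python
# Hedged sketch: sample candidate yaw angles, evaluate the reprojection
# error at each, and keep the angle with the smallest error as the initial
# yaw angle information.
import math

def initial_yaw(x, y, reprojection_error, num_samples=8):
    candidates = [2.0 * math.pi * k / num_samples for k in range(num_samples)]
    # pick the yaw whose projected model keypoints best match the detections
    return min(candidates, key=lambda yaw: reprojection_error((x, y, yaw)))
```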
  • Sub-step 3: based on the initial position information and the initial yaw angle information, determine the initial pose information of the target object.
  • the initial pose information of the target object can then be determined. For example, given the initial position information (X, Y, 0) and the initial yaw angle information of 90°, the initial pose information (X, Y, 90°) can be determined.
  • in this way, the initial pose information of the target object can be roughly determined, which helps determine more accurate target pose information.
  • FIG. 3 shows a structural block diagram of an apparatus for determining a pose provided by an embodiment of the present application.
  • the apparatus for determining a pose may be a module, program segment or code on an electronic device.
  • the device corresponds to the above-mentioned method embodiment in FIG. 1 , and can execute various steps involved in the method embodiment in FIG. 1 .
  • the specific functions of the device can refer to the description above. To avoid repetition, detailed descriptions are appropriately omitted here.
  • the above-mentioned device for determining a pose may include an acquisition module 301 , a first determination module 302 , a second determination module 303 and a third determination module 304 .
  • the obtaining module 301 is configured to obtain a set of images to be processed, the set including a plurality of target images corresponding to the target object; the first determination module 302 is configured to, for each target image, determine the initial pose information of the target object based on the target key point information of that target image; the second determination module 303 is configured to determine the initial model of the target object; the third determination module 304 is configured to determine the target model and the target pose of the target object based on the initial model and the initial pose information respectively corresponding to the plurality of target images.
  • the apparatus for determining a pose may further include an information acquisition module configured to acquire prior information matching the target object, the prior information being used to characterize the structural information and/or size information of the target object; the third determination module 304 is further configured to determine the target model and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images.
  • the third determining module 304 may be further configured to: calculate a minimum error value based on the prior information, the initial model, and a plurality of initial pose information, wherein the minimum error value is characterized by an object specification error value and a reprojection error value; the object specification error value represents the error between the structural information and/or size information of the obtained object model and the prior information, and the reprojection error value represents the reprojection error between the projection points of the object model in the corresponding target image and the corresponding key points; determine the object model optimized when the minimum error value is obtained as the target model, and the pose optimized when the minimum error value is obtained as the target pose.
  • the minimum error value can be obtained based on the following steps: when it is detected that an error value is smaller than the error threshold, that error value is determined as the minimum error value; or, when the number of iterations reaches the iteration upper limit, the smallest of the plurality of error values is determined as the minimum error value; wherein the iteration-count threshold corresponding to the iteration upper limit matches the number of target images included in the image set to be processed.
  • the third determination module 304 may be further configured to: for each iterative calculation, calculate the current object specification error value based on the prior information and the current model of the target object obtained by the current optimization; calculate the current reprojection error value based on the current model and the currently corresponding initial pose information; and determine the minimum error value based on the current object specification error value and the current reprojection error value.
  • the third determination module 304 may be further configured to: determine the product or sum of the current object specification error value and the current reprojection error value as the error value obtained in the current iteration; and determine the minimum error value based on the error value calculated in the current iteration and the error values calculated in historical iterations.
  • the first determination module 302 may be further configured to: determine the initial position information of the target key point in the world coordinate system based on the position information of the target key point and the camera calibration information;
  • the world coordinate system includes a coordinate system that takes the motion plane of the target object as a coordinate plane; determine the initial yaw angle information matching the initial position information; and, based on the initial position information and the initial yaw angle information, determine the initial pose information of the target object.
  • the first determining module 302 may be further configured to: within the target angle range, select multiple angles to match the initial position information, determine the reprojection error value of the model key point corresponding to the target key point at each angle, and determine the angle information corresponding to the smallest reprojection error value as the initial yaw angle information.
  • the initial model may be determined in advance based on the following steps: for each object model in the object model library, determine the projection points, in the target image, of the model key points representing that object model; determine the reprojection error values between the projection points and the corresponding key points of the target image; and determine the object model corresponding to the smallest reprojection error value as the initial model.
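The model-library search above can be sketched as follows. This is an assumed illustration: `project` is a hypothetical helper mapping a 3-D model key point to pixel coordinates, and the model dictionaries are an assumed data layout, not a structure defined by the patent.

```python
# Hedged sketch: project every candidate model's key points into the target
# image and pick the model with the smallest total reprojection error as the
# initial model.
import numpy as np

def select_initial_model(model_library, detected_keypoints, project):
    def total_error(model):
        # sum of pixel distances between projected model key points and the
        # corresponding detected key points in the target image
        return sum(np.linalg.norm(np.asarray(project(p)) - np.asarray(q))
                   for p, q in zip(model["keypoints"], detected_keypoints))
    return min(model_library, key=total_error)
```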
  • the set of images to be processed may include a plurality of target images determined from the moving path of the target object.
  • FIG. 4 is a schematic structural diagram of an electronic device for performing a pose determination method provided by an embodiment of the present application.
  • the electronic device may include: at least one processor 401, such as a CPU, and at least one communication interface 402 , at least one memory 403 and at least one communication bus 404 .
  • the communication bus 404 is used to enable direct connection and communication among these components.
  • the communication interface 402 of the device in the embodiment of the present application is used for signaling or data communication with other node devices.
  • the memory 403 may be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory.
  • the memory 403 may also be at least one storage device located far away from the aforementioned processor.
  • Computer-readable instructions are stored in the memory 403 , and when the computer-readable instructions are executed by the processor 401 , the electronic device may, for example, execute the above-mentioned method process shown in FIG. 1 .
  • FIG. 4 is only for illustration, and the electronic device may also include more or less components than those shown in FIG. 4 , or have a configuration different from that shown in FIG. 4 .
  • Each component shown in FIG. 4 may be implemented by hardware, software or a combination thereof.
  • An embodiment of the present application provides a readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the method process performed by the electronic device in the method embodiment shown in FIG. 1 is executed.
  • the computer program product includes a computer program, and the computer program includes program instructions; when the program instructions are executed by a computer, the computer can execute the methods provided in the above method embodiments.
  • the method may include: acquiring a set of images to be processed, the set including a plurality of target images corresponding to the target object; for each target image, determining the initial pose information of the target object based on the target key point information of that target image; determining the initial model of the target object; and, based on the initial model and the initial pose information corresponding to each of the plurality of target images, determining the target model and the target pose of the target object.
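The overall method summarized above can be sketched structurally as follows. Every helper here is a hypothetical callable passed in by the caller, not an API defined by this application: `detect_keypoints` finds target key points in an image, `estimate_initial_pose` turns them into an initial pose, `select_initial_model` picks a starting model, and `joint_optimize` refines the model and the poses together.

```python
# Hedged structural sketch of the pipeline: per-image pose initialization,
# initial-model selection, then joint refinement of model and poses.

def determine_pose(target_images, model_library, detect_keypoints,
                   estimate_initial_pose, select_initial_model, joint_optimize):
    # one initial pose per target image
    initial_poses = [estimate_initial_pose(detect_keypoints(img))
                     for img in target_images]
    initial_model = select_initial_model(model_library, target_images)
    # jointly determine the target model and the target poses
    return joint_optimize(initial_model, initial_poses)
```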
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separate, and a component displayed as a unit may or may not be a physical unit; that is, it may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present application may be integrated to form an independent part, each module may exist independently, or two or more modules may be integrated to form an independent part.
  • the present application provides a pose determining method, device, electronic equipment and readable storage medium.
  • by first obtaining an initial model that differs from the target model, and then simultaneously determining the target model and the target pose of the target object based on the initial model and the initial pose information of the target object shown in multiple target images, the pose determination method requires neither a pre-generated target model nor a matching algorithm between the target object and the target model, thereby solving the problems that the target object's model cannot be pre-generated and that matching the target object to the target model is difficult during pose estimation.
  • the pose determination method, device, electronic device and readable storage medium of the present application are reproducible and can be used in various industrial applications.
  • the pose determination method, device, electronic device, and readable storage medium of the present application can be used in pose estimation processes for objects such as vehicles and drones.


Abstract

Provided by the present application are a pose determination method and apparatus, an electronic device, and a readable storage medium. A specific embodiment of the method comprises: obtaining an image set to be processed, the image set to be processed comprising a plurality of target images; for each of the target images, on the basis of target key point information of the target image, determining initial pose information of a target object; and on the basis of an initial model and the initial pose information corresponding to each of the plurality of target images, determining a target model and a target pose. In the described method, a target model and target pose can be determined at the same time, so that there is no need to generate an object model in advance and no need to design a matching algorithm between the target object and the object model, optimizing the processing steps of a pose estimation process.

Description

Pose determination method, device, electronic device and readable storage medium

Cross-Reference to Related Applications

This application claims the priority of the Chinese patent application No. 202110932980.9, titled "Pose Determination Method, Device, Electronic Equipment, and Readable Storage Medium", filed with the State Intellectual Property Office of China on August 13, 2021, the entire content of which is incorporated in this application by reference.

Technical Field

The present application relates to the field of information processing, and in particular to a pose determination method, device, electronic equipment, and readable storage medium.

Background

With the continuous development of intelligent systems, the requirements for automated processing of surveillance video keep increasing. When determining the pose of a target object in surveillance video or captured images, the depth information of the three-dimensional world is lost during image acquisition, which makes the pose parameters of the target object difficult to recover.

In related technologies, the pose parameters of a target object are commonly estimated in one of two ways. The first is to render images of the target object in different poses from its three-dimensional model, then train a convolutional neural network using the rendered images as input and the pose parameters corresponding to the different poses as expected output. Once the network converges, it can be used for pose estimation: the image to be processed is fed to the network, and the output is the corresponding pose parameters. Because this image-recognition-based approach involves no strict projection-relationship equation, it is difficult to obtain accurate estimates of the target pose parameters, and generalization is poor. The second is to use deep learning to predict the two-dimensional key points of the target object and, combining the target object's three-dimensional model with camera calibration information, estimate the pose from the projection relationship between the model points of the three-dimensional model and the two-dimensional key points of the target object in the image. This method helps improve the accuracy and robustness of the estimation, but it requires a pre-generated three-dimensional model of the target object and an accurate model-matching algorithm, which makes the estimation process considerably more difficult.
Summary of the Invention

The purpose of the embodiments of the present application is to provide a pose determination method, device, electronic equipment, and readable storage medium that obtain the target model and the target pose simultaneously, thereby solving the problems that, during pose estimation, the target model of the target object cannot be pre-generated and matching the target object to the target model is difficult.

An embodiment of the present application provides a pose determination method, which may include: acquiring a set of images to be processed, the set including a plurality of target images corresponding to a target object; for each target image, determining the initial pose information of the target object based on the target key point information of that target image; determining the initial model of the target object; and, based on the initial model and the initial pose information corresponding to each of the plurality of target images, determining the target model and the target pose of the target object. In this way, the target model and the target pose can be determined simultaneously, so that an object model need not be generated in advance and no matching algorithm between the target object and the object model need be designed, which streamlines the processing steps of pose estimation.
Optionally, the pose determination method may further include: acquiring prior information matching the target object, the prior information being used to characterize the structural information and/or size information of the target object; and determining the target model and the target pose of the target object based on the initial model and the initial pose information corresponding to each of the plurality of target images includes: determining the target model and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images. In this way, the structure and/or size of the target object can be constrained by the added prior information, improving optimization accuracy and making the obtained target model and target pose more accurate.

Optionally, determining the target model and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images may include: calculating a minimum error value based on the prior information, the initial model, and the plurality of initial pose information, wherein the minimum error value is characterized by an object specification error value and a reprojection error value; the object specification error value represents the error between the structural information and/or size information of the obtained object model and the prior information, and the reprojection error value represents the reprojection error between the projection points of the obtained object model in the corresponding target image and the corresponding key points; and determining the object model optimized when the minimum error value is obtained as the target model, and the pose optimized when the minimum error value is obtained as the target pose. This provides an implementation that determines the target model and the target pose simultaneously.

Optionally, the minimum error value may be obtained as follows: when an error value is detected to be smaller than an error threshold, that error value is determined as the minimum error value; or, when the number of iterations reaches the iteration upper limit, the smallest of the plurality of error values is determined as the minimum error value, wherein the iteration-count threshold corresponding to the iteration upper limit matches the number of target images included in the image set to be processed. These two ways of determining the minimum error value can be used alternatively in practice.

Optionally, calculating the minimum error value based on the prior information, the initial model, and the plurality of initial pose information may include: for each iterative calculation, calculating the current object specification error value based on the prior information and the current model of the target object obtained by the current optimization; calculating the current reprojection error value based on the current model and the currently corresponding initial pose information; and determining the minimum error value based on the current object specification error value and the current reprojection error value. In this way, the minimum error value can be determined from the object specification error value and the reprojection error value, so that the determined target model is closer to the target object and the determined target pose is closer to the actual pose of the target object at the moment the target image was captured.

Optionally, determining the minimum error value based on the current object specification error value and the current reprojection error value may include: determining the product or sum of the current object specification error value and the current reprojection error value as the error value obtained in the current iteration; and determining the minimum error value based on the error value calculated in the current iteration and the error values calculated in historical iterations. This provides one way of determining the minimum error value.
Optionally, the target key point information includes position information of the target key point, and determining, for each target image, the initial pose information of the target object based on the target key point information of that target image may include: determining the initial position information of the target key point in the world coordinate system based on the position information of the target key point and the camera calibration information, the world coordinate system being a coordinate system that takes the motion plane of the target object as a coordinate plane; determining the initial yaw angle information matching the initial position information; and determining the initial pose information of the target object based on the initial position information and the initial yaw angle information. This helps determine more accurate target pose information.

Optionally, determining the initial yaw angle information matching the initial position information may include: within the target angle range, selecting multiple angles to match the initial position information, and determining the reprojection error value of the model key point corresponding to the target key point at each angle; and determining the angle information corresponding to the smallest reprojection error value as the initial yaw angle information, so that the initial yaw angle information is closer to the target yaw angle information.

Optionally, the initial model may be determined in advance as follows: for each object model in the object model library, determining the projection points, in the target image, of the model key points representing that object model; determining the reprojection error values between the projection points and the corresponding key points of the target image; and determining the object model corresponding to the smallest reprojection error value as the initial model. In this way, the initial model can be closer to the target model, speeding up the determination of the target model.

Optionally, the set of images to be processed may include a plurality of target images determined from the moving path of the target object. In this way, reasonably close initial pose information can be obtained from multiple target images that characterize the motion trajectory of the target object.
本申请实施例还提供了一种位姿确定装置，该装置可以包括：获取模块，被配置成用于获取待处理图像集；所述待处理图像集中包括目标对象对应的多个目标图像；第一确定模块，被配置成用于针对每一个所述目标图像，基于该目标图像的目标关键点信息，确定所述目标对象的初始位姿信息；第二确定模块，被配置成用于确定所述目标对象的初始模型；第三确定模块，被配置成用于基于所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息，确定所述目标对象的目标模型和所述目标对象的目标位姿。The embodiment of the present application also provides a pose determination apparatus, which may include: an acquisition module configured to acquire an image set to be processed, the image set to be processed including a plurality of target images corresponding to a target object; a first determining module configured to determine, for each of the target images, initial pose information of the target object based on target key point information of that target image; a second determining module configured to determine an initial model of the target object; and a third determining module configured to determine a target model of the target object and a target pose of the target object based on the initial model and the initial pose information corresponding to each of the plurality of target images.
本申请实施例还提供了一种电子设备，所述电子设备包括处理器以及存储器，所述存储器存储有计算机可读取指令，当所述计算机可读取指令由所述处理器执行时，运行如上述提供的所述方法中的步骤。The embodiment of the present application also provides an electronic device. The electronic device includes a processor and a memory, the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps in the method provided above are performed.
本申请实施例还提供了一种可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时运行如上述提供的所述方法中的步骤。The embodiment of the present application also provides a readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the method provided above are performed.
本申请实施例还提供了一种计算机程序产品，所述计算机程序产品包括计算机程序，所述计算机程序包括程序指令，当所述程序指令被计算机执行时，所述计算机能够执行如上述提供的所述方法中的步骤。The embodiment of the present application also provides a computer program product. The computer program product includes a computer program, and the computer program includes program instructions; when the program instructions are executed by a computer, the computer can perform the steps in the method provided above.
本申请的其他特征和优点将在随后的说明书阐述,并且,部分地从说明书中变得显而易见,或者通过实施本申请实施例了解。本申请的目的和其他优点可通过在所写的说明书、权利要求书、以及附图中所特别指出的结构来实现和获得。Other features and advantages of the present application will be set forth in the ensuing description and, in part, will be apparent from the description, or can be learned by practicing the embodiments of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
附图说明Description of drawings
为了更清楚地说明本申请实施例的技术方案，下面将对本申请实施例中所需要使用的附图作简单地介绍，应当理解，以下附图仅示出了本申请的某些实施例，因此不应被看作是对范围的限定，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图获得其他相关的附图。In order to more clearly illustrate the technical solutions of the embodiments of the present application, the accompanying drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and therefore should not be regarded as limiting the scope; those of ordinary skill in the art can also obtain other related drawings from these drawings without creative effort.
图1为本申请实施例提供的一种位姿确定方法的流程图;FIG. 1 is a flow chart of a pose determination method provided in an embodiment of the present application;
图2为本申请实施例提供的另一种位姿确定方法的流程图;FIG. 2 is a flow chart of another pose determination method provided in the embodiment of the present application;
图3为本申请实施例提供的一种位姿确定装置的结构框图;Fig. 3 is a structural block diagram of a pose determination device provided by an embodiment of the present application;
图4为本申请实施例提供的一种用于执行位姿确定方法的电子设备的结构示意图。Fig. 4 is a schematic structural diagram of an electronic device for performing a pose determination method provided by an embodiment of the present application.
具体实施方式Detailed ways
下面将结合本申请实施例中附图,对本申请实施例中的技术方案进行清楚、完整地描述。The following will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application.
近年来，基于人工智能的计算机视觉、深度学习、机器学习、图像处理、图像识别等技术研究取得了重要进展。人工智能(Artificial Intelligence,AI)是研究、开发用于模拟、延伸人的智能的理论、方法、技术及应用系统的新兴科学技术。人工智能学科是一门综合性学科，涉及芯片、大数据、云计算、物联网、分布式存储、深度学习、机器学习、神经网络等诸多技术种类。计算机视觉作为人工智能的一个重要分支，具体是让机器识别世界，计算机视觉技术通常包括人脸识别、活体检测、指纹识别与防伪验证、生物特征识别、人脸检测、行人检测、目标检测、行人识别、图像处理、图像识别、图像语义理解、图像检索、文字识别、视频处理、视频内容识别、行为识别、三维重建、虚拟现实、增强现实、同步定位与地图构建（SLAM）、计算摄影、机器人导航与定位等技术。随着人工智能技术的研究和进步，该项技术在众多领域展开了应用，例如安防、城市管理、交通管理、楼宇管理、园区管理、人脸通行、人脸考勤、物流管理、仓储管理、机器人、智能营销、计算摄影、手机影像、云服务、智能家居、穿戴设备、无人驾驶、自动驾驶、智能医疗、人脸支付、人脸解锁、指纹解锁、人证核验、智慧屏、智能电视、摄像机、移动互联网、网络直播、美颜、美妆、医疗美容、智能测温等领域。In recent years, research on artificial-intelligence-based computer vision, deep learning, machine learning, image processing, image recognition, and related technologies has made important progress. Artificial Intelligence (AI) is an emerging science and technology that studies and develops theories, methods, technologies, and application systems for simulating and extending human intelligence. Artificial intelligence is a comprehensive discipline involving many technologies such as chips, big data, cloud computing, the Internet of Things, distributed storage, deep learning, machine learning, and neural networks. Computer vision, an important branch of artificial intelligence, aims to let machines recognize the world. Computer vision technologies usually include face recognition, liveness detection, fingerprint recognition and anti-counterfeiting verification, biometric recognition, face detection, pedestrian detection, target detection, pedestrian recognition, image processing, image recognition, image semantic understanding, image retrieval, text recognition, video processing, video content recognition, behavior recognition, 3D reconstruction, virtual reality, augmented reality, simultaneous localization and mapping (SLAM), computational photography, and robot navigation and positioning.
With the research and progress of artificial intelligence technology, this technology has been applied in many fields, such as security, urban management, traffic management, building management, park management, face-based access, face-based attendance, logistics management, warehouse management, robots, smart marketing, computational photography, mobile imaging, cloud services, smart homes, wearable devices, unmanned driving, autonomous driving, smart healthcare, face payment, face unlock, fingerprint unlock, identity verification, smart screens, smart TVs, cameras, the mobile Internet, webcasting, beauty filters, cosmetics, medical aesthetics, intelligent temperature measurement, and other fields.
相关技术中，存在在目标对象的姿态估计过程中对象模型无法预先生成以及目标对象与目标模型之间匹配困难的问题。为了解决上述问题，本申请提供一种位姿确定方法、装置、电子设备和可读存储介质。通过首先获取与目标模型存在差别的初始模型，然后基于该初始模型与多个目标图像中分别示出的目标对象的初始位姿信息，同时确定出目标对象的目标模型和目标位姿，以使得不用预先生成目标模型以及无需设计目标对象与目标模型之间的匹配算法，从而解决了上述问题。在实际应用时，本申请可以应用于诸如车辆、无人机等对应的位姿估计过程。示例性地，本申请以应用于车辆的位姿估计过程为例阐述该位姿确定方法。也即，上述目标对象可以包括目标车辆。而保证目标车辆的目标位姿的准确性是必要的。例如在智能交通监控领域中，可以通过确定出的目标位姿进行诸如统计车流量、判断驾驶员是否违规驾驶等情况。In the related art, in the pose estimation process of a target object, the object model cannot be generated in advance, and matching between the target object and the target model is difficult. To solve these problems, the present application provides a pose determination method, apparatus, electronic device, and readable storage medium. An initial model that differs from the target model is first obtained, and then the target model and the target pose of the target object are determined simultaneously based on the initial model and the initial pose information of the target object shown in the multiple target images, so that the target model does not need to be generated in advance and no matching algorithm between the target object and the target model needs to be designed, thereby solving the above problems. In practical applications, the present application can be applied to pose estimation of objects such as vehicles and drones. Illustratively, the present application describes the pose determination method using vehicle pose estimation as an example; that is, the above target object may include a target vehicle. Ensuring the accuracy of the target pose of the target vehicle is necessary: for example, in the field of intelligent traffic monitoring, the determined target pose can be used to count traffic flow, determine whether a driver is driving illegally, and the like.
以上相关技术中的方案所存在的缺陷，均是发明人在经过实践并仔细研究后得出的结果，因此，上述问题的发现过程以及下文中本申请实施例针对上述问题所提出的解决方案，都应该是发明人在本申请过程中对本申请做出的贡献。The defects in the solutions of the above related art are all results obtained by the inventor after practice and careful study. Therefore, the process of discovering the above problems, and the solutions proposed for them in the embodiments of the present application below, should all be regarded as the inventor's contributions to the present application.
请参考图1,其示出了本申请实施例提供的第一种位姿确定方法的流程图。如图1所示,该位姿确定方法可以包括以下步骤101至步骤104。Please refer to FIG. 1 , which shows a flowchart of a first method for determining a pose provided by an embodiment of the present application. As shown in FIG. 1 , the pose determination method may include the following steps 101 to 104 .
步骤101,获取待处理图像集;所述待处理图像集中包括目标对象对应的多个目标图像; Step 101, acquiring a set of images to be processed; the set of images to be processed includes a plurality of target images corresponding to a target object;
上述目标图像例如可以包括目标货车、目标面包车、目标轿车等目标对象所对应的图像。The aforementioned target image may include, for example, images corresponding to target objects such as a target truck, a target van, and a target car.
进一步的，可以在获取了目标对象的多个图像之后，对这些图像进行整理，以得到上述待处理图像集。在一些应用场景中，上述目标图像可以通过网络资源获得，也可以通过在实地利用相机拍摄得到，还可以通过预先录制的录像获取。这里，在通过录像获取时，可以将截取的包括目标对象的图像视为目标图像。Further, after multiple images of the target object are acquired, these images may be organized to obtain the above image set to be processed. In some application scenarios, the above target images may be obtained from network resources, captured on site with a camera, or obtained from a pre-recorded video. Here, when they are acquired from a recording, a captured frame that includes the target object may be regarded as a target image.
在一些可选的实现方式中,所述待处理图像集可以包括从目标对象的运动路径中确定的多个目标图像。In some optional implementation manners, the set of images to be processed may include multiple target images determined from a moving path of the target object.
在一些应用场景中,可以在目标对象的运动路径中拍摄多个图像,拍摄的图像可以视为目标图像。这样,拍摄得到的多个目标图像可以用于表征目标对象的运动轨迹。通过能够表征目标对象运动轨迹的多个目标图像,可以得到较为接近的初始位姿信息。In some application scenarios, multiple images may be captured in the moving path of the target object, and the captured images may be regarded as target images. In this way, the multiple captured target images can be used to characterize the movement trajectory of the target object. The closer initial pose information can be obtained through multiple target images that can represent the motion trajectory of the target object.
步骤102,针对每一个所述目标图像,基于该目标图像的目标关键点信息,确定所述目标对象的初始位姿信息; Step 102, for each target image, determine the initial pose information of the target object based on the target key point information of the target image;
获取到上述待处理图像集之后,可以针对待处理图像集中的每个目标图像的目标关键点信息确定出目标对象的初始位姿信息。After the above image set to be processed is acquired, the initial pose information of the target object can be determined for the target key point information of each target image in the image set to be processed.
上述目标关键点信息可以视为目标图像中能够用于表征目标对象的位置的关键点的信息。关键点例如可以包括车辆前车标、车辆左视镜、车辆右视镜等在目标车辆图像中对应的点。在一些应用场景中，可以在上述的诸如车辆前车标、车辆左视镜、车辆右视镜等关键点中选择一个或多个作为目标关键点，目标关键点对应的图像坐标信息可以视为上述目标关键点信息。这里的图像坐标信息例如可以为(u,v)。这里的坐标参数“u”、“v”可以为在图像坐标系下的任意值。The above target key point information can be regarded as information on key points in the target image that can be used to characterize the position of the target object. The key points may include, for example, the points in the target vehicle image corresponding to the front vehicle logo, the left-view mirror, and the right-view mirror. In some application scenarios, one or more of these key points may be selected as target key points, and the image coordinate information corresponding to the target key points can be regarded as the above target key point information. The image coordinate information here may be, for example, (u, v), where the coordinate parameters "u" and "v" can be any values in the image coordinate system.
上述初始位姿信息可以视为能够粗糙表征目标对象在实际应用场景中的位置信息和姿态信息。该初始位姿信息例如可以通过目标对象在世界坐标系下的坐标信息(X,Y,θ)表征。可选地,上述坐标参数“X”、“Y”、“θ”可以为在该世界坐标系下对应的任意值。可选地,“θ”可以视为目标对象的姿态信息。The above initial pose information can be regarded as being able to roughly represent the position information and attitude information of the target object in the actual application scene. The initial pose information may be represented by coordinate information (X, Y, θ) of the target object in the world coordinate system, for example. Optionally, the above-mentioned coordinate parameters "X", "Y", and "θ" may be any corresponding values in the world coordinate system. Optionally, "θ" can be regarded as the pose information of the target object.
步骤103,确定所述目标对象的初始模型; Step 103, determining the initial model of the target object;
上述初始模型例如可以是预先确定的与目标对象相类似的模型。该模型可以与目标模型对应的尺寸相差较大,继而无需预先生成目标模型。例如,目标对象为轿车,初始模型例如可以为面包车模型。在一些应用场景中,可以预先准备车辆模型库,该车辆模型库例如可以实际应用中的各种车型为参照设计。The aforementioned initial model may be, for example, a predetermined model similar to the target object. The size of the model can be quite different from the size corresponding to the target model, so that the target model does not need to be generated in advance. For example, the target object is a car, and the initial model can be, for example, a van model. In some application scenarios, a vehicle model library may be prepared in advance, and the vehicle model library may, for example, be designed with reference to various vehicle types in actual applications.
在一些可选的实现方式中,所述初始模型预先基于如下步骤确定:In some optional implementation manners, the initial model is pre-determined based on the following steps:
步骤A,针对对象模型库中的每一个对象模型,确定表征该对象模型的模型关键点在目标图像中的投影点;Step A, for each object model in the object model library, determine the projection point of the model key point representing the object model in the target image;
上述对象模型库例如可以为上述车辆模型库。相应的,上述模型关键点例如可以包括车辆模型的车标、左视镜等实质上能够表征对象模型的关键点。The above-mentioned object model library may be, for example, the above-mentioned vehicle model library. Correspondingly, the aforementioned key points of the model may include, for example, key points of the vehicle model that can substantially represent the object model, such as the logo of the vehicle model and the left-view mirror.
例如,针对车辆模型库中的每一个车辆模型,可以确定表征该车辆模型的前车标关键点在目标车辆图像中的投影点。For example, for each vehicle model in the vehicle model library, the projection point of the key point of the front vehicle logo representing the vehicle model in the target vehicle image may be determined.
进一步的，上述投影点例如可以通过投影公式得到。例如，已知目标车辆模型信息(x_w, y_w, z_w)以及其位姿信息(X, Y, θ)，可以利用投影公式计算出投影点对应的图像坐标为(u, v)。在这些应用场景中，上述u、v可以表征在其所属坐标系下的任意数字，上述x_w、y_w、z_w可以分别表征目标车辆模型的长度信息、宽度信息以及高度信息。上述X、Y可以为在其所属坐标系下的任意值，上述θ可以为(0, 2π)内的任意值。上述投影公式例如可以包括：

λ·[u, v, 1]^T = K·P·[x_w, y_w, z_w, 1]^T

其中，λ为尺度因子，K为相机的内参数，P为相机的外参数。Further, the above projection point can be obtained, for example, through a projection formula. For example, given the target vehicle model information (x_w, y_w, z_w) and its pose information (X, Y, θ), the image coordinates (u, v) of the corresponding projection point can be calculated with the projection formula. In these application scenarios, u and v can be any values in the coordinate system to which they belong, and x_w, y_w, and z_w can respectively represent the length, width, and height of the target vehicle model. X and Y can be any values in the coordinate system to which they belong, and θ can be any value within (0, 2π). The projection formula may include, for example:

λ·[u, v, 1]^T = K·P·[x_w, y_w, z_w, 1]^T

where λ is a scale factor, K is the camera intrinsic parameter matrix, and P is the camera extrinsic parameter matrix.
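The projection formula described above can be sketched in code as follows. The intrinsic matrix K and extrinsic matrix P used here are hypothetical example values for illustration only, not calibration data from the application.

```python
import numpy as np

def project_point(K, P, point_w):
    """Apply the projection formula: lambda * [u, v, 1]^T = K @ P @ [x_w, y_w, z_w, 1]^T.
    K is the 3x3 camera intrinsic matrix; P is the 3x4 extrinsic matrix [R|t]."""
    p_homog = np.append(point_w, 1.0)   # [x_w, y_w, z_w, 1]
    uvw = K @ P @ p_homog               # lambda * [u, v, 1]
    return uvw[:2] / uvw[2]             # divide out the scale factor lambda

# Hypothetical intrinsics and an identity-rotation extrinsic (camera 5 m from origin)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
uv = project_point(K, P, np.array([1.0, 0.5, 0.0]))  # pixel coords (480.0, 320.0)
```

The last division by `uvw[2]` is exactly the elimination of the scale factor λ mentioned in the formula.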
步骤B,确定所述投影点与所述目标图像的对应关键点之间的重投影误差值;Step B, determining the reprojection error value between the projection point and the corresponding key point of the target image;
确定了上述投影点之后,可以计算该投影点与该投影点在目标图像中的对应关键点之间的重投影误差值。例如,确定了利用车辆前车标作为车辆模型关键点A之后,可以基于上述投影公式得到该模型关键点A在目标图像中的投影点A'。然后可以确定投影点A'与其对应关键点a之间的重投影误差值。在一些应用场景中,例如可以通过最小二乘法计算上述重投影误差值。这里,利用最小二乘法计算该重投影误差值的方式为相关技术,此处不赘述。After the above projection point is determined, the reprojection error value between the projection point and the corresponding key point in the target image can be calculated. For example, after it is determined to use the vehicle front logo as the vehicle model key point A, the projection point A' of the model key point A in the target image can be obtained based on the above projection formula. The reprojection error value between projected point A' and its corresponding keypoint a can then be determined. In some application scenarios, for example, the above reprojection error value may be calculated by the least square method. Here, the manner of calculating the reprojection error value by using the least square method is a related technology, which will not be described in detail here.
步骤C,将最小的重投影误差值所对应的对象模型确定为所述初始模型。Step C, determining the object model corresponding to the smallest reprojection error value as the initial model.
确定了各个对象模型分别对应的重投影误差值之后,可以比较各个重投影误差值的大小,继而可以确定出最小的重投影误差值。After the reprojection error values corresponding to each object model are determined, the magnitudes of each reprojection error value can be compared, and then the smallest reprojection error value can be determined.
确定了最小的重投影误差值之后,可以将该最小的重投影误差值对应的对象模型确定为上述初始模型。例如,确定了车辆模型A对应的重投影误差值最小时,可以将车辆模型A确定为初始模型。After the minimum reprojection error value is determined, the object model corresponding to the minimum reprojection error value may be determined as the above initial model. For example, when it is determined that the reprojection error value corresponding to vehicle model A is the smallest, vehicle model A may be determined as the initial model.
通过上述步骤A至步骤C,可以通过计算的重投影误差值确定出初始模型。使得初始模型能够更加贴近于目标模型,加快确定目标模型的速率。Through the above steps A to C, the initial model can be determined through the calculated reprojection error value. This makes the initial model closer to the target model and speeds up the determination of the target model.
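Steps A to C above can be sketched as a minimal Python loop over an object model library. The `project` callback and the model library contents are illustrative assumptions standing in for the projection formula and the vehicle model library described earlier.

```python
import numpy as np

def model_reprojection_error(projected_keypoints, image_keypoints):
    """Sum of squared distances between projected model key points and the
    corresponding key points detected in the target image (step B)."""
    diff = np.asarray(projected_keypoints) - np.asarray(image_keypoints)
    return float(np.sum(diff ** 2))

def select_initial_model(model_library, image_keypoints, project):
    """Steps A-C: project each candidate model's key points, compute the
    reprojection error, and return the model with the smallest error."""
    best_name, best_error = None, float("inf")
    for name, model_keypoints in model_library.items():
        projected = project(model_keypoints)          # step A
        error = model_reprojection_error(projected, image_keypoints)  # step B
        if error < best_error:                        # step C
            best_name, best_error = name, error
    return best_name, best_error
```

In practice `project` would apply the camera projection formula with the estimated initial pose; here it is only a placeholder.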
步骤104,基于所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。Step 104: Determine a target model of the target object and a target pose of the target object based on the initial model and the initial pose information corresponding to each of the plurality of target images.
确定了初始模型以及多个初始位姿信息之后,例如可以基于不同的初始位姿信息利用最小二乘法不断迭代优化,以得到上述目标模型和目标位姿。在一些应用场景中,上述目标模型例如可以应用于测量车辆的长度信息、高度信息和/或宽度信息等。After the initial model and multiple pieces of initial pose information are determined, for example, based on different initial pose information, the least square method can be used for continuous iterative optimization to obtain the above-mentioned target model and target pose. In some application scenarios, the above target model may be applied to measure length information, height information, and/or width information of the vehicle, for example.
通过上述步骤101至步骤104,可以同时确定出目标模型和目标位姿,使得不用预先生成对象模型以及无需设计目标对象与对象模型之间的匹配算法,优化了位姿估计过程的处理步骤。Through the above steps 101 to 104, the target model and the target pose can be determined at the same time, so that there is no need to pre-generate the object model and design a matching algorithm between the target object and the object model, and optimize the processing steps of the pose estimation process.
请参考图2,其示出了本申请实施例提供的另一种图像处理方法的流程图。如图2所示,该图像处理方法可以包括以下步骤201至步骤205。Please refer to FIG. 2 , which shows a flowchart of another image processing method provided by an embodiment of the present application. As shown in FIG. 2 , the image processing method may include the following steps 201 to 205 .
步骤201,获取待处理图像集;所述待处理图像集中包括目标对象对应的多个目标图像; Step 201, acquiring a set of images to be processed; the set of images to be processed includes a plurality of target images corresponding to a target object;
上述步骤201的实现过程以及取得的技术效果可以与图1所示实施例的步骤101相同或相似,此处不赘述。The implementation process of the above step 201 and the obtained technical effect may be the same as or similar to the step 101 of the embodiment shown in FIG. 1 , and details are not described here.
步骤202,针对每一个所述目标图像,基于该目标图像的目标关键点信息,确定所述目标对象的初始位姿信息; Step 202, for each target image, based on the target key point information of the target image, determine the initial pose information of the target object;
上述步骤202的实现过程以及取得的技术效果可以与图1所示实施例的步骤102相同 或相似,此处不赘述。The implementation process of the above-mentioned step 202 and the technical effect obtained may be the same as or similar to the step 102 of the embodiment shown in FIG. 1 , and will not be repeated here.
步骤203,确定所述目标对象的初始模型; Step 203, determining the initial model of the target object;
上述步骤203的实现过程以及取得的技术效果可以与图1所示实施例的步骤103相同或相似,此处不赘述。The implementation process of the above step 203 and the obtained technical effect may be the same as or similar to the step 103 of the embodiment shown in FIG. 1 , and details are not described here.
步骤204,获取与所述目标对象相匹配的先验信息;所述先验信息用于表征所述目标对象的结构信息和/或尺寸信息; Step 204, acquiring prior information matching the target object; the prior information is used to characterize the structure information and/or size information of the target object;
在一些应用场景中，可以获取目标对象的先验信息。上述先验信息例如可以包括目标对象的诸如长度信息、宽度信息、高度信息等尺寸信息，以及车牌与前车标之间的共面信息、左视镜与右视镜之间的对称信息等结构信息。In some application scenarios, prior information of the target object can be acquired. The above prior information may include, for example, size information of the target object such as length, width, and height, as well as structural information such as coplanarity between the license plate and the front vehicle logo and symmetry between the left-view mirror and the right-view mirror.
步骤205,基于所述先验信息、所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。Step 205: Determine the target model of the target object and the target pose of the target object based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images.
获取到先验信息、初始模型以及多个初始位姿信息之后,可以同时确定出目标模型和目标位姿。After obtaining the prior information, the initial model, and multiple initial pose information, the target model and target pose can be determined simultaneously.
实践中,在获取目标图像时,常常存在缺乏目标对象在各个角度的图像信息,继而使得图像中的一些关键点由于获取视角的不同导致优化精度有限的问题。通过上述步骤201至步骤205,可以通过添加先验信息约束目标对象的结构和/或尺寸,提高优化精度,使得到的目标模型、目标位姿更加准确。In practice, when the target image is acquired, there is often a lack of image information of the target object at various angles, and then some key points in the image have limited optimization accuracy due to different acquisition angles of view. Through the above steps 201 to 205, the structure and/or size of the target object can be constrained by adding prior information to improve the optimization accuracy and make the obtained target model and target pose more accurate.
在一些可选的实现方式中,可以通过以下子步骤同时确定出目标模型和目标位姿:In some optional implementations, the target model and target pose can be determined simultaneously through the following sub-steps:
子步骤2051,基于所述先验信息、所述初始模型以及多个所述初始位姿信息,计算最小误差值;其中,所述最小误差值通过对象规格误差值和重投影误差值进行表征;所述对象规格误差值表征得到的对象模型所对应的结构信息和/或尺寸信息与所述先验信息之间的误差;所述重投影误差值表征得到的对象模型在对应的目标图像中的投影点与对应关键点之间的重投影误差;Sub-step 2051, calculate a minimum error value based on the prior information, the initial model, and a plurality of the initial pose information; wherein, the minimum error value is characterized by an object specification error value and a reprojection error value; The object specification error value represents an error between the structure information and/or size information corresponding to the obtained object model and the prior information; the reprojection error value represents the difference between the obtained object model in the corresponding target image The reprojection error between the projected point and the corresponding keypoint;
在一些应用场景中，可以基于多个初始位姿信息进行迭代计算，得到最小误差值。例如可以利用非线性最小二乘法计算最小误差值。在这些应用场景中，最小误差值可以通过对象规格误差值和重投影误差值进行表征。例如，可以通过将对象规格误差值与重投影误差值相乘得到的最小乘积作为最小误差值；也可以将对象规格误差值与重投影误差值相加得到的最小和作为最小误差值。这里，对象规格误差值与重投影误差值相加得到的和，可以是两者的代数和，也可以是两者的加权和，具体可以根据实际情况进行选择，此处不限定。上述加权和，例如可以包括为两者分配不同的权重值实现。In some application scenarios, iterative calculation can be performed based on multiple pieces of initial pose information to obtain the minimum error value; for example, the minimum error value can be calculated using a nonlinear least squares method. In these application scenarios, the minimum error value can be characterized by the object specification error value and the reprojection error value. For example, the smallest product obtained by multiplying the object specification error value and the reprojection error value can be used as the minimum error value, or the smallest sum obtained by adding them can be used as the minimum error value. Here, the sum of the object specification error value and the reprojection error value may be an algebraic sum or a weighted sum of the two, which can be selected according to the actual situation and is not limited here. The weighted sum may be realized, for example, by assigning different weight values to the two.
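The two ways of combining the error terms mentioned above (weighted sum or product) can be sketched as a small helper. The default weights and the `mode` parameter name are assumptions for illustration; the application leaves the exact combination open.

```python
def combined_error(spec_error, reproj_error,
                   w_spec=1.0, w_reproj=1.0, mode="weighted_sum"):
    """Combine the object specification error and the reprojection error
    into one value, either as a weighted sum or as a product."""
    if mode == "weighted_sum":
        return w_spec * spec_error + w_reproj * reproj_error
    if mode == "product":
        return spec_error * reproj_error
    raise ValueError(f"unknown mode: {mode}")
```

With `w_spec == w_reproj == 1.0` the weighted sum reduces to the plain algebraic sum also mentioned above.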
在一些可选的实现方式中,上述子步骤2051可以包括以下步骤:In some optional implementation manners, the above substep 2051 may include the following steps:
步骤一,针对每一次迭代计算,基于所述先验信息以及当次优化得到的所述目标对象 的当次模型,计算当次对象规格误差值;Step 1, for each iterative calculation, based on the prior information and the current model of the target object obtained by the current optimization, calculate the current object specification error value;
在一些应用场景中，针对每一次迭代计算，可以基于先验信息以及当前得到的目标对象的当次模型，确定当次对象规格误差值。上述当次对象规格误差值例如可以视为表征当次得到的目标模型与先验信息之间的误差。例如，目标车辆的先验信息为：高度为3米、宽度为2米、长度为5米，此时，若得到的对象模型表征目标车辆：高度为3米、宽度为1.8米、长度为5米，可以计算当次对象规格误差值为：高度误差为0米、宽度误差为0.2米、长度误差为0米。In some application scenarios, for each iterative calculation, the current object specification error value can be determined based on the prior information and the currently obtained model of the target object. The current object specification error value can be regarded as representing the error between the currently obtained target model and the prior information. For example, if the prior information of the target vehicle is a height of 3 meters, a width of 2 meters, and a length of 5 meters, and the obtained object model represents the target vehicle with a height of 3 meters, a width of 1.8 meters, and a length of 5 meters, the current object specification error values can be calculated as: a height error of 0 meters, a width error of 0.2 meters, and a length error of 0 meters.
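The height/width/length example in step one can be reproduced with a small helper computing per-dimension absolute errors between the prior and the current model. The dictionary keys are hypothetical names chosen for illustration.

```python
def spec_error_values(prior_dims, model_dims):
    """Per-dimension absolute errors between the prior information
    (height, width, length) and the current optimized model's dimensions."""
    return {k: abs(prior_dims[k] - model_dims[k]) for k in prior_dims}

# The example above: prior 3 x 2 x 5 m vs. a current model of 3 x 1.8 x 5 m
errors = spec_error_values(
    {"height": 3.0, "width": 2.0, "length": 5.0},
    {"height": 3.0, "width": 1.8, "length": 5.0},
)
# -> height error 0 m, width error ~0.2 m, length error 0 m
```

These per-dimension values could then be reduced (e.g. summed) into the scalar object specification error used by the optimization.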
步骤二,基于所述当次模型以及当次对应的初始位姿信息,计算当次重投影误差值;Step 2, based on the current model and the current corresponding initial pose information, calculate the current reprojection error value;
在一些应用场景中，针对每一次迭代计算，可以基于当前得到的对象模型以及当次对应的初始位姿信息，确定当次对应的重投影误差值。例如，可以确定当次对应的初始位姿信息所对应的目标图像，然后可以确定能够表征当前得到的对象模型的模型关键点对应的投影点，以及计算该投影点与目标图像的对应关键点之间的重投影误差值。这里，可以参照上述实施例中步骤B的相关部分，此处不赘述。In some application scenarios, for each iterative calculation, the current reprojection error value can be determined based on the currently obtained object model and the corresponding initial pose information. For example, the target image corresponding to the current initial pose information can be determined, the projection points of the model key points representing the currently obtained object model can then be determined, and the reprojection error value between the projection points and the corresponding key points of the target image can be calculated. Here, reference may be made to the relevant part of step B in the above embodiment, which will not be repeated here.
步骤三,基于所述当次对象规格误差值和所述当次重投影误差值,确定所述最小误差值。Step 3: Determine the minimum error value based on the current object specification error value and the current reprojection error value.
确定了当次对象规格误差值和当次重投影误差值之后,可以确定上述最小误差值。例如,可以为当次规格误差值以及当前重投影误差值分配不同的权重,确定上述最小误差值。After the object specification error value of the current time and the reprojection error value of the current time are determined, the above minimum error value can be determined. For example, different weights may be assigned to the current specification error value and the current reprojection error value to determine the aforementioned minimum error value.
相关技术中,存在重投影误差值较小,但是得到的位姿却与实际位姿差别较大的情况。因此,本实施例通过上述子步骤一至子步骤三引入了对象规格误差值,继而可以基于对象规格误差值以及重投影误差值,确定出最小误差值,以使得确定出的目标模型与目标对象更加贴近,确定出的目标位姿与目标对象在目标图像拍摄时刻的实际位姿更加贴近。In the related art, there are cases where the reprojection error value is small, but the obtained pose is quite different from the actual pose. Therefore, this embodiment introduces the object specification error value through the above sub-steps 1 to 3, and then the minimum error value can be determined based on the object specification error value and the reprojection error value, so that the determined target model is closer to the target object. Closeness, the determined target pose is closer to the actual pose of the target object at the moment when the target image is captured.
在一些可选的实现方式中，上述步骤三可以包括：首先，将所述当次对象规格误差值与所述当次重投影误差值所对应的乘积或和，确定为当次迭代计算得到的误差值；In some optional implementations, the above step three may include: first, determining the product or sum of the current object specification error value and the current reprojection error value as the error value obtained in the current iteration calculation;
也就是说,在一些应用场景中,可以将当次规格误差值与当次重投影误差值进行累加,得到两者之和,继而可以得到当前迭代时所对应的误差值。在另一些应用场景中,也可以将当次规格误差值与当次重投影误差值相乘,得到两者之积,继而可以得到当前迭代时所对应的误差值。That is to say, in some application scenarios, the current specification error value and the current reprojection error value can be accumulated to obtain the sum of the two, and then the error value corresponding to the current iteration can be obtained. In other application scenarios, the current specification error value can also be multiplied by the current reprojection error value to obtain the product of the two, and then the error value corresponding to the current iteration can be obtained.
然后,基于所述当次迭代计算得到的误差值以及历史迭代计算得到的误差值,确定所述最小误差值。Then, the minimum error value is determined based on the error value calculated by the current iteration and the error value calculated by historical iterations.
每一次迭代计算可以对应有误差值,在得到当次迭代对应的误差值之后,可以将该误差值与历史迭代计算得到的误差值进行比较,确定出当前对应的最小误差值。Each iterative calculation can correspond to an error value. After obtaining the error value corresponding to the current iteration, the error value can be compared with the error value obtained by historical iterative calculations to determine the current corresponding minimum error value.
子步骤2052,将得到所述最小误差值时优化得到的对象模型确定为所述目标模型,以 及将得到所述最小误差值时优化得到的位姿确定为所述目标位姿。In sub-step 2052, the object model optimized when the minimum error value is obtained is determined as the target model, and the pose optimized when the minimum error value is obtained is determined as the target pose.
得到最小误差之后,可以将得到最小误差值时所对应的对象模型确定为目标模型,并可以将此时对应的位姿确定为目标位姿。After the minimum error is obtained, the object model corresponding to the minimum error value can be determined as the target model, and the corresponding pose at this time can be determined as the target pose.
在一些可选的实现方式中，所述最小误差值基于以下步骤得到：在检测到误差值小于误差阈值时，将该误差值确定为所述最小误差值；或者在迭代次数达到迭代上限时，将多个误差值中最小的误差值确定为所述最小误差值；其中，达到迭代上限时所对应的迭代次数阈值与所述待处理图像集中包括所述目标图像的数量匹配。In some optional implementations, the minimum error value is obtained based on the following steps: when an error value is detected to be smaller than an error threshold, determining that error value as the minimum error value; or, when the number of iterations reaches the iteration upper limit, determining the smallest error value among the multiple error values as the minimum error value, where the iteration count threshold corresponding to the iteration upper limit matches the number of target images included in the image set to be processed.
在一些应用场景中,可以在检测到误差值小于误差阈值时停止迭代计算。进一步,可以将此时优化得到的对象模型确定为目标模型,并可以将此时对应得到的位姿信息确定为目标位姿信息。这样,可以使得在当前的目标模型与目标对象基本吻合,目标位姿与目标对象的实际位姿基本吻合的同时,减小迭代计算的计算量。在这些应用场景中,上述误差阈值例如可以包括0.08、0.1等实质上可以使误差值所对应的对象模型视为目标模型的数值。In some application scenarios, the iterative calculation may be stopped when it is detected that an error value is smaller than the error threshold. Further, the object model optimized at this point may be determined as the target model, and the pose information obtained at this point may be determined as the target pose information. In this way, the amount of iterative computation can be reduced while the current target model still basically matches the target object and the target pose basically matches the actual pose of the target object. In these application scenarios, the error threshold may be, for example, 0.08 or 0.1, i.e. a value small enough that the object model corresponding to an error value below it can be regarded as the target model.
在另一些应用场景中,也可以在达到最大迭代次数时停止迭代计算。例如,每一个目标图像所对应的初始位姿信息均被使用,本次迭代计算过程无法再继续时,可以视为达到最大迭代次数。此时得到的对象模型可以被确定为目标模型,并可以将当前得到的位姿信息确定为目标位姿。In other application scenarios, the iterative calculation can also be stopped when the maximum number of iterations is reached. For example, the initial pose information corresponding to each target image is used, and when the iterative calculation process can no longer continue, it can be regarded as reaching the maximum number of iterations. The object model obtained at this time can be determined as the target model, and the currently obtained pose information can be determined as the target pose.
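The two stopping conditions described above (an error falling below the threshold, or the iteration cap being reached) can be sketched as follows; the helper name and the form of the error sequence are assumptions of this illustration:

```python
def minimize_error(errors_per_iteration, error_threshold=0.1, max_iters=None):
    """Track the minimum error over successive iterations, stopping early
    when an error falls below the threshold or the iteration cap is hit.

    `errors_per_iteration` is any iterable yielding one error value per
    iteration (in a real optimizer each value would come from combining
    the specification error and the reprojection error of that iteration).
    Returns (index_of_best_iteration, minimum_error)."""
    best_index, best_error = -1, float("inf")
    for i, err in enumerate(errors_per_iteration):
        if max_iters is not None and i >= max_iters:
            break                      # iteration upper limit reached
        if err < best_error:
            best_index, best_error = i, err
        if err < error_threshold:
            break                      # error small enough: stop early
    return best_index, best_error
```

The model and pose recorded at `best_index` would then serve as the target model and target pose.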
在一些可选的实现方式中,所述目标关键点信息包括目标关键点的位置信息;以及上述图1所示实施例中步骤102或者图2所示实施例中步骤202可以包括以下子步骤:In some optional implementation manners, the target key point information includes position information of the target key point; and step 102 in the above embodiment shown in FIG. 1 or step 202 in the embodiment shown in FIG. 2 may include the following sub-steps:
子步骤1,基于所述目标关键点的位置信息以及相机标定信息,确定该目标关键点在世界坐标系下的初始位置信息;所述世界坐标系包括以所述目标对象的运动平面作为坐标面的坐标系;Sub-step 1: based on the position information of the target key point and the camera calibration information, determine the initial position information of the target key point in the world coordinate system; the world coordinate system includes a coordinate system taking the motion plane of the target object as a coordinate plane;
上述相机标定信息可以包括相机的内参矩阵和外参矩阵,以对拍摄的图像进行矫正,得到畸变较小的目标图像。The above camera calibration information may include an internal reference matrix and an external reference matrix of the camera, so as to correct the captured image and obtain a target image with less distortion.
上述世界坐标系可以包括以目标对象的运动平面作为坐标面的坐标系。例如,可以将目标车辆的运动路面作为横纵坐标面的路面坐标系视为世界坐标系。The above-mentioned world coordinate system may include a coordinate system using the motion plane of the target object as a coordinate plane. For example, the road surface coordinate system in which the moving road surface of the target vehicle is taken as the horizontal and vertical coordinate plane can be regarded as the world coordinate system.
通过目标关键点的位置信息以及相机标定信息,可以确定该目标关键点在世界坐标系下的初始位置信息。例如,将车辆前车标作为目标关键点时,可以基于目标关键点的图像坐标(u,v)以及相机标定信息,利用投影公式确定出该目标关键点在世界坐标系下的初始位置信息(X,Y,Z)。当世界坐标系为上述路面坐标系时,该初始位置信息可以为(X,Y,0)。Through the position information of the target key point and the camera calibration information, the initial position information of the target key point in the world coordinate system can be determined. For example, when the front vehicle logo is used as the target key point, the projection formula can be used, based on the image coordinates (u, v) of the target key point and the camera calibration information, to determine the initial position information (X, Y, Z) of the target key point in the world coordinate system. When the world coordinate system is the aforementioned road surface coordinate system, the initial position information may be (X, Y, 0).
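A minimal sketch of this back-projection, assuming a pinhole camera with intrinsic matrix K and world-to-camera extrinsics (R, t), and a world coordinate system whose Z=0 plane is the motion plane; the function name and interface are hypothetical:

```python
import numpy as np

def pixel_to_ground(u, v, K, R, t):
    """Back-project pixel (u, v) onto the ground plane Z=0 of the world
    coordinate system, given intrinsics K and world-to-camera extrinsics
    R, t. Returns the point (X, Y, 0)."""
    # Direction of the viewing ray, first in camera then in world coordinates
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_world = R.T @ ray_cam
    cam_center = -R.T @ t                 # camera centre in world coordinates
    s = -cam_center[2] / ray_world[2]     # scale at which the ray meets Z=0
    X, Y, _ = cam_center + s * ray_world
    return np.array([X, Y, 0.0])
```

The pixel's viewing ray is intersected with the ground plane, which is why a single image suffices to recover (X, Y, 0) for a key point known to lie on the road surface.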
子步骤2,确定与所述初始位置信息匹配的初始偏航角信息;Sub-step 2, determining the initial yaw angle information matching the initial position information;
确定了目标关键点在世界坐标系下的初始位置信息之后,可以继续确定与该初始位置信息相匹配的初始偏航角信息。After the initial position information of the target key point in the world coordinate system is determined, the initial yaw angle information matching the initial position information can be determined further.
在一些可选的实现方式中,上述初始偏航角信息可以基于如下步骤确定:首先,在目标角度范围内,选取多个角度与所述初始位置信息进行匹配,确定所述目标关键点对应的模型关键点在各个角度下分别对应的重投影误差值;In some optional implementations, the above initial yaw angle information may be determined based on the following steps: first, within a target angle range, multiple angles are selected to be matched with the initial position information, and the reprojection error values of the model key point corresponding to the target key point at each of the angles are determined;
上述目标角度范围例如可以包括(0,2π)。The aforementioned target angle range may include (0, 2π), for example.
在一些应用场景中,可以在目标角度范围内选取多个角度分别与初始位置信息进行匹配。例如,可以等间隔选取90°、180°、270°、360°等分别与初始位置信息匹配。进一步的,在选取了90°作为与初始位置信息进行匹配的角度之后,上述模型关键点的位姿信息可以为(X,Y,90°),此时可以计算该模型关键点对应的投影点与目标关键点之间的重投影误差。相类似地,可以分别计算多个模型关键点在各自角度下分别对应的重投影误差。In some application scenarios, multiple angles can be selected within the target angle range to be matched with the initial position information respectively. For example, 90°, 180°, 270°, 360°, etc. may be selected at equal intervals and matched with the initial position information respectively. Further, after 90° is selected as the angle to be matched with the initial position information, the pose information of the above model key point may be (X, Y, 90°); at this point, the reprojection error between the projection point corresponding to this model key point and the target key point can be calculated. Similarly, the reprojection errors corresponding to the model key points at the respective angles can be calculated.
然后,将最小的重投影误差值对应的角度信息确定为所述初始偏航角信息。Then, the angle information corresponding to the smallest reprojection error value is determined as the initial yaw angle information.
确定了多个重投影误差值之后,可以将最小的重投影误差值所对应的角度信息确定为初始偏航角信息,以使得初始偏航角信息更加接近于目标偏航角信息。例如,确定了位姿信息为(X,Y,90°)的模型关键点所对应的重投影误差值最小时,可以将90°确定为初始偏航角信息。After multiple reprojection error values are determined, angle information corresponding to the smallest reprojection error value may be determined as initial yaw angle information, so that the initial yaw angle information is closer to the target yaw angle information. For example, when it is determined that the reprojection error value corresponding to the key point of the model whose pose information is (X, Y, 90°) is the smallest, 90° may be determined as the initial yaw angle information.
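The angle sweep in the two steps above can be sketched as follows (the `project` callback, the sampling density and the function name are assumptions of this illustration; the embodiment only requires selecting several angles in the target range and keeping the one with the smallest reprojection error):

```python
import numpy as np

def init_yaw(project, model_points, image_points, num_angles=36):
    """Sweep candidate yaw angles over (0, 2*pi) and return the one whose
    projected model key points have the smallest total reprojection error.

    `project(points, yaw)` is assumed to return the 2-D projections of
    `model_points` under the candidate yaw (position held fixed)."""
    best_yaw, best_err = None, float("inf")
    # Equally spaced candidates in (0, 2*pi); index 0 (yaw = 0) is skipped
    for yaw in np.linspace(0.0, 2 * np.pi, num_angles, endpoint=False)[1:]:
        proj = project(model_points, yaw)
        err = np.linalg.norm(proj - image_points, axis=1).sum()
        if err < best_err:
            best_yaw, best_err = yaw, err
    return best_yaw
```

The winning angle is only a coarse initial yaw; it is subsequently refined together with the position in the joint optimization.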
子步骤3,基于所述初始位置信息和所述初始偏航角信息,确定所述目标对象的初始位姿信息。Sub-step 3, based on the initial position information and the initial yaw angle information, determine the initial pose information of the target object.
确定了初始位置信息以及初始偏航角信息之后,可以确定目标对象的初始位姿信息。例如,确定了初始位置信息(X,Y,0)以及初始偏航角信息90°时,可以确定出初始位姿信息(X,Y,90°)。After the initial position information and the initial yaw angle information are determined, the initial pose information of the target object can be determined. For example, when the initial position information (X, Y, 0) and the initial yaw angle information of 90° are determined, the initial pose information (X, Y, 90°) can be determined.
通过上述子步骤1至子步骤3,可以粗糙确定出目标对象的初始位姿信息,利于确定出更加准确的目标位姿信息。Through the above sub-steps 1 to 3, the initial pose information of the target object can be roughly determined, which is beneficial to determine more accurate target pose information.
请参考图3,其示出了本申请实施例提供的一种位姿确定装置的结构框图,该位姿确定装置可以是电子设备上的模块、程序段或代码。应理解,该装置与上述图1方法实施例对应,能够执行图1方法实施例涉及的各个步骤,该装置具体的功能可以参见上文中的描述,为避免重复,此处适当省略详细描述。Please refer to FIG. 3 , which shows a structural block diagram of an apparatus for determining a pose provided by an embodiment of the present application. The apparatus for determining a pose may be a module, program segment or code on an electronic device. It should be understood that the device corresponds to the above-mentioned method embodiment in FIG. 1 , and can execute various steps involved in the method embodiment in FIG. 1 . The specific functions of the device can refer to the description above. To avoid repetition, detailed descriptions are appropriately omitted here.
可选地,上述位姿确定装置可以包括获取模块301,第一确定模块302、第二确定模块303和第三确定模块304。其中,获取模块301,被配置成用于获取待处理图像集;所述待处理图像集中包括目标对象对应的多个目标图像;第一确定模块302,被配置成用于针对每一个所述目标图像,基于该目标图像的目标关键点信息,确定所述目标对象的初始位姿信息;第二确定模块303,被配置成用于确定所述目标对象的初始模型;第三确定模块304,被配置成用于基于所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。Optionally, the above pose determination apparatus may include an acquisition module 301, a first determination module 302, a second determination module 303 and a third determination module 304. The acquisition module 301 is configured to acquire a set of images to be processed, the set of images to be processed including a plurality of target images corresponding to the target object; the first determination module 302 is configured to, for each of the target images, determine the initial pose information of the target object based on the target key point information of the target image; the second determination module 303 is configured to determine an initial model of the target object; and the third determination module 304 is configured to determine a target model of the target object and a target pose of the target object based on the initial model and the initial pose information respectively corresponding to the plurality of target images.
可选地,所述位姿确定装置还可以包括信息获取模块,上述信息获取模块被配置成用于:获取与所述目标对象相匹配的先验信息;所述先验信息用于表征所述目标对象的结构信息和/或尺寸信息;所述第三确定模块304进一步被配置成用于:基于所述先验信息、所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。Optionally, the pose determination apparatus may further include an information acquisition module configured to acquire prior information matching the target object, the prior information being used to characterize structure information and/or size information of the target object; the third determination module 304 is further configured to determine the target model of the target object and the target pose of the target object based on the prior information, the initial model, and the initial pose information respectively corresponding to the plurality of target images.
可选地,所述第三确定模块304可以进一步被配置成用于:基于所述先验信息、所述初始模型以及多个所述初始位姿信息,计算最小误差值;其中,所述最小误差值通过对象规格误差值和重投影误差值进行表征;所述对象规格误差值表征得到的对象模型所对应的结构信息和/或尺寸信息与所述先验信息之间的误差;所述重投影误差值表征得到的对象模型在对应的目标图像中的投影点与对应关键点之间的重投影误差;将得到所述最小误差值时优化得到的对象模型确定为所述目标模型,以及将得到所述最小误差值时优化得到的位姿确定为所述目标位姿。Optionally, the third determination module 304 may be further configured to: calculate a minimum error value based on the prior information, the initial model and the plurality of pieces of initial pose information, wherein the minimum error value is characterized by an object specification error value and a reprojection error value, the object specification error value characterizes an error between the structure information and/or size information corresponding to the obtained object model and the prior information, and the reprojection error value characterizes a reprojection error between projection points of the obtained object model in the corresponding target image and the corresponding key points; and determine the object model optimized when the minimum error value is obtained as the target model, and determine the pose optimized when the minimum error value is obtained as the target pose.
可选地,所述最小误差值可以基于以下步骤得到:在检测到误差值小于误差阈值时,将该误差值确定为所述最小误差值;或者在迭代次数达到迭代上限时,将多个误差值中最小的误差值确定为所述最小误差值;其中,达到迭代上限时所对应的迭代次数阈值与所述待处理图像集中包括所述目标图像的数量匹配。Optionally, the minimum error value may be obtained based on the following steps: when an error value is detected to be smaller than an error threshold, determining that error value as the minimum error value; or, when the number of iterations reaches an iteration upper limit, determining the smallest error value among the multiple error values as the minimum error value, wherein the iteration-count threshold corresponding to the iteration upper limit matches the number of target images included in the image set to be processed.
可选地,所述第三确定模块304可以进一步被配置成用于:针对每一次迭代计算,基于所述先验信息以及当次优化得到的所述目标对象的当次模型,计算当次对象规格误差值;以及基于所述当次模型以及当次对应的初始位姿信息,计算当次重投影误差值;以及基于所述当次对象规格误差值和所述当次重投影误差值,确定所述最小误差值。Optionally, the third determination module 304 may be further configured to: for each iterative calculation, calculate a current object specification error value based on the prior information and the current model of the target object obtained by the current optimization; calculate a current reprojection error value based on the current model and the corresponding current initial pose information; and determine the minimum error value based on the current object specification error value and the current reprojection error value.
可选地,所述第三确定模块304可以进一步被配置成用于:将所述当次对象规格误差值与所述当次重投影误差值所对应的乘积或和,确定为当次迭代计算得到的误差值;以及基于所述当次迭代计算得到的误差值以及历史迭代计算得到的误差值,确定所述最小误差值。Optionally, the third determination module 304 may be further configured to: determine the product or sum of the current object specification error value and the current reprojection error value as the error value calculated in the current iteration; and determine the minimum error value based on the error value calculated in the current iteration and the error values calculated in historical iterations.
可选地,所述第一确定模块302可以进一步被配置成用于:基于所述目标关键点的位置信息以及相机标定信息,确定该目标关键点在世界坐标系下的初始位置信息;所述世界坐标系包括以所述目标对象的运动平面作为坐标面的坐标系;确定与所述初始位置信息匹配的初始偏航角信息;基于所述初始位置信息和所述初始偏航角信息,确定所述目标对象的初始位姿信息。Optionally, the first determination module 302 may be further configured to: determine the initial position information of the target key point in the world coordinate system based on the position information of the target key point and the camera calibration information, the world coordinate system including a coordinate system taking the motion plane of the target object as a coordinate plane; determine initial yaw angle information matching the initial position information; and determine the initial pose information of the target object based on the initial position information and the initial yaw angle information.
可选地,所述第一确定模块302可以进一步被配置成用于:在目标角度范围内,选取多个角度与所述初始位置信息进行匹配,确定所述目标关键点对应的模型关键点在各个角度下分别对应的重投影误差值;将最小的重投影误差值对应的角度信息确定为所述初始偏航角信息。Optionally, the first determination module 302 may be further configured to: within a target angle range, select multiple angles to be matched with the initial position information, and determine the reprojection error values of the model key point corresponding to the target key point at each of the angles; and determine the angle information corresponding to the smallest reprojection error value as the initial yaw angle information.
可选地,所述初始模型预先可以基于如下步骤确定:针对对象模型库中的每一个对象模型,确定表征该对象模型的模型关键点在目标图像中的投影点;以及确定所述投影点与所述目标图像的对应关键点之间的重投影误差值;将最小的重投影误差值所对应的对象模型确定为所述初始模型。Optionally, the initial model may be determined in advance based on the following steps: for each object model in an object model library, determining the projection points, in the target image, of the model key points characterizing that object model; determining the reprojection error value between the projection points and the corresponding key points of the target image; and determining the object model corresponding to the smallest reprojection error value as the initial model.
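A sketch of this model-selection step, assuming each candidate model's key points are exposed through a hypothetical `project` callback that returns their 2-D projections in the target image:

```python
import numpy as np

def select_initial_model(model_library, project, image_keypoints):
    """Pick, from a library of candidate object models, the one whose
    projected model key points best match the key points detected in the
    target image (smallest total reprojection error)."""
    best_model, best_err = None, float("inf")
    image_keypoints = np.asarray(image_keypoints)
    for model in model_library:
        proj = np.asarray(project(model))   # model key points projected into image
        err = np.linalg.norm(proj - image_keypoints, axis=1).sum()
        if err < best_err:
            best_model, best_err = model, err
    return best_model
```

The selected model is only a starting point; it is subsequently refined jointly with the pose, so it does not need to match the target object exactly.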
可选地,所述待处理图像集可以包括从所述目标对象的运动路径中确定的多个目标图像。Optionally, the set of images to be processed may include a plurality of target images determined from the moving path of the target object.
需要说明的是,本领域技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统和装置的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再重复描述。It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the system and apparatus described above, which are not repeated here.
请参照图4,图4为本申请实施例提供的一种用于执行位姿确定方法的电子设备的结构示意图,所述电子设备可以包括:至少一个处理器401,例如CPU,至少一个通信接口402,至少一个存储器403和至少一个通信总线404。其中,通信总线404用于实现这些组件直接的连接通信。其中,本申请实施例中设备的通信接口402用于与其他节点设备进行信令或数据的通信。存储器403可以是高速RAM存储器,也可以是非易失性的存储器(non-volatile memory),例如至少一个磁盘存储器。存储器403可选的还可以是至少一个位于远离前述处理器的存储装置。存储器403中存储有计算机可读取指令,当所述计算机可读取指令由所述处理器401执行时,电子设备例如可以执行上述图1所示方法过程。Please refer to FIG. 4, which is a schematic structural diagram of an electronic device for performing the pose determination method provided by an embodiment of the present application. The electronic device may include: at least one processor 401, such as a CPU, at least one communication interface 402, at least one memory 403 and at least one communication bus 404, where the communication bus 404 is used to realize direct connection and communication among these components. The communication interface 402 of the device in this embodiment of the application is used for signaling or data communication with other node devices. The memory 403 may be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory; optionally, the memory 403 may also be at least one storage apparatus located away from the aforementioned processor. The memory 403 stores computer-readable instructions, and when the computer-readable instructions are executed by the processor 401, the electronic device may, for example, execute the method process shown in FIG. 1 above.
可以理解,图4所示的结构仅为示意,所述电子设备还可包括比图4中所示更多或者更少的组件,或者具有与图4所示不同的配置。图4中所示的各组件可以采用硬件、软件或其组合实现。It can be understood that the structure shown in FIG. 4 is only for illustration, and the electronic device may also include more or less components than those shown in FIG. 4 , or have a configuration different from that shown in FIG. 4 . Each component shown in FIG. 4 may be implemented by hardware, software or a combination thereof.
本申请实施例提供一种可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时,执行如图1所示方法实施例中电子设备所执行的方法过程。An embodiment of the present application provides a readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the method process performed by the electronic device in the method embodiment shown in FIG. 1 is executed.
本实施例公开一种计算机程序产品,所述计算机程序产品包括计算机程序,所述计算机程序包括程序指令,当所述程序指令被计算机执行时,计算机能够执行上述各方法实施例所提供的方法,例如,该方法可以包括:获取待处理图像集;所述待处理图像集中包括目标对象对应的多个目标图像;针对每一个所述目标图像,基于该目标图像的目标关键点信息,确定所述目标对象的初始位姿信息;确定所述目标对象的初始模型;基于所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。This embodiment discloses a computer program product. The computer program product includes a computer program, and the computer program includes program instructions; when the program instructions are executed by a computer, the computer can execute the methods provided in the above method embodiments. For example, the method may include: acquiring a set of images to be processed, the set of images to be processed including a plurality of target images corresponding to a target object; for each target image, determining initial pose information of the target object based on the target key point information of the target image; determining an initial model of the target object; and determining a target model of the target object and a target pose of the target object based on the initial model and the initial pose information respectively corresponding to the plurality of target images.
在本申请所提供的实施例中,应该理解到,所揭露装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,又例如,多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
另外,作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。In addition, a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
再者,在本申请各个实施例中的各功能模块可以集成在一起形成一个独立的部分,也可以是各个模块单独存在,也可以两个或两个以上模块集成形成一个独立的部分。Furthermore, each functional module in each embodiment of the present application may be integrated to form an independent part, each module may exist independently, or two or more modules may be integrated to form an independent part.
在本文中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。In this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations.
以上所述仅为本申请的实施例而已,并不用于限制本申请的保护范围,对于本领域的技术人员来说,本申请可以有各种更改和变化。凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。The above descriptions are only examples of the present application, and are not intended to limit the scope of protection of the present application. For those skilled in the art, various modifications and changes may be made to the present application. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of this application shall be included within the protection scope of this application.
工业实用性Industrial Applicability
本申请提供了一种位姿确定方法、装置、电子设备和可读存储介质。通过首先获取与目标模型存在差别的初始模型,然后基于该初始模型与多个目标图像中分别示出的目标对象的初始位姿信息,同时确定出目标对象的目标模型和目标位姿,以使得不用预先生成目标模型以及无需设计目标对象与目标模型之间的匹配算法,从而解决了在目标对象的姿态估计过程中对象模型无法预先生成以及目标对象与目标模型之间匹配困难的问题。The present application provides a pose determination method, apparatus, electronic device and readable storage medium. By first obtaining an initial model that differs from the target model, and then simultaneously determining the target model and the target pose of the target object based on the initial model and the initial pose information of the target object shown in multiple target images, the target model does not need to be generated in advance and no matching algorithm between the target object and the target model needs to be designed, thereby solving the problems that, during pose estimation of the target object, the object model cannot be generated in advance and matching between the target object and the target model is difficult.
此外,可以理解的是,本申请的位姿确定方法、装置、电子设备和可读存储介质是可以重现的,并且可以用在多种工业应用中。例如,本申请的位姿确定方法、装置、电子设备和可读存储介质可以用于诸如车辆、无人机等对应的位姿估计过程。In addition, it can be understood that the pose determining method, device, electronic device and readable storage medium of the present application are reproducible and can be used in various industrial applications. For example, the pose determination method, device, electronic device, and readable storage medium of the present application can be used in corresponding pose estimation processes such as vehicles and drones.

Claims (14)

  1. 一种位姿确定方法,其特征在于,包括:A pose determination method, characterized in that the method comprises:
    获取待处理图像集;所述待处理图像集中包括目标对象对应的多个目标图像;Acquire a set of images to be processed; the set of images to be processed includes a plurality of target images corresponding to the target object;
    针对每一个所述目标图像,基于该目标图像的目标关键点信息,确定所述目标对象的初始位姿信息;For each of the target images, based on the target key point information of the target image, determine the initial pose information of the target object;
    确定所述目标对象的初始模型;determining an initial model of the target object;
    基于所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。Based on the initial model and the initial pose information corresponding to each of the plurality of target images, determine a target model of the target object and a target pose of the target object.
  2. 根据权利要求1所述的方法,其特征在于,所述方法还包括:The method according to claim 1, further comprising:
    获取与所述目标对象相匹配的先验信息;所述先验信息用于表征所述目标对象的结构信息和/或尺寸信息;Obtain prior information matching the target object; the prior information is used to characterize the structure information and/or size information of the target object;
    所述基于所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿,包括:The determining the target model of the target object and the target pose of the target object based on the initial model and the initial pose information corresponding to each of the plurality of target images includes:
    基于所述先验信息、所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿。Based on the prior information, the initial model, and the initial pose information corresponding to each of the multiple target images, determine a target model of the target object and a target pose of the target object.
  3. 根据权利要求2所述的方法,其特征在于,所述基于所述先验信息、所述初始模型以及所述多个目标图像各自对应的所述初始位姿信息,确定所述目标对象的目标模型和所述目标对象的目标位姿,包括:The method according to claim 2, wherein the target of the target object is determined based on the prior information, the initial model, and the initial pose information corresponding to each of the plurality of target images. Model and target pose of the target object, including:
    基于所述先验信息、所述初始模型以及多个所述初始位姿信息,计算最小误差值;其中,所述最小误差值通过对象规格误差值和重投影误差值进行表征;所述对象规格误差值表征得到的对象模型所对应的结构信息和/或尺寸信息与所述先验信息之间的误差;所述重投影误差值表征得到的对象模型在对应的目标图像中的投影点与对应关键点之间的重投影误差;calculating a minimum error value based on the prior information, the initial model and the plurality of pieces of initial pose information; wherein the minimum error value is characterized by an object specification error value and a reprojection error value; the object specification error value characterizes an error between the structure information and/or size information corresponding to the obtained object model and the prior information; and the reprojection error value characterizes a reprojection error between projection points of the obtained object model in the corresponding target image and the corresponding key points;
    将得到所述最小误差值时优化得到的对象模型确定为所述目标模型,以及将得到所述最小误差值时优化得到的位姿确定为所述目标位姿。The object model optimized when the minimum error value is obtained is determined as the target model, and the pose optimized when the minimum error value is obtained is determined as the target pose.
  4. 根据权利要求3所述的方法,其特征在于,所述最小误差值基于以下步骤得到:The method according to claim 3, wherein the minimum error value is obtained based on the following steps:
    在检测到误差值小于误差阈值时,将该误差值确定为所述最小误差值;或者When it is detected that the error value is less than the error threshold, the error value is determined as the minimum error value; or
    在迭代次数达到迭代上限时,将多个误差值中最小的误差值确定为所述最小误差值;其中,达到迭代上限时所对应的迭代次数阈值与所述待处理图像集中包括所述目标图像的数量匹配。when the number of iterations reaches an iteration upper limit, determining the smallest error value among the multiple error values as the minimum error value; wherein the iteration-count threshold corresponding to the iteration upper limit matches the number of target images included in the image set to be processed.
  5. 根据权利要求3或4所述的方法,其特征在于,所述基于所述先验信息、所述初始模型以及多个所述初始位姿信息,计算最小误差值,包括:The method according to claim 3 or 4, wherein the calculating a minimum error value based on the prior information, the initial model and the plurality of pieces of initial pose information comprises:
    针对每一次迭代计算,基于所述先验信息以及当次优化得到的所述目标对象的当次模型,计算当次对象规格误差值;以及For each iterative calculation, based on the prior information and the current model of the target object obtained by the current optimization, calculate the current object specification error value; and
    基于所述当次模型以及当次对应的初始位姿信息,计算当次重投影误差值;以及Based on the current model and the current corresponding initial pose information, calculate the current reprojection error value; and
    基于所述当次对象规格误差值和所述当次重投影误差值,确定所述最小误差值。The minimum error value is determined based on the current object specification error value and the current reprojection error value.
  6. 根据权利要求5所述的方法,其特征在于,所述基于所述当次对象规格误差值和所述当次重投影误差值,确定所述最小误差值,包括:The method according to claim 5, wherein the determining the minimum error value based on the current object specification error value and the current reprojection error value comprises:
    将所述当次对象规格误差值与所述当次重投影误差值所对应的乘积或和,确定为当次迭代计算得到的误差值;以及determining the product or sum corresponding to the object specification error value of the current time and the reprojection error value of the current time as the error value calculated by the current iteration; and
    基于所述当次迭代计算得到的误差值以及历史迭代计算得到的误差值,确定所述最小误差值。The minimum error value is determined based on the error value calculated by the current iteration and the error value calculated by historical iterations.
  7. 根据权利要求1至6中任一项所述的方法,其特征在于,所述目标关键点信息包括目标关键点的位置信息;以及The method according to any one of claims 1 to 6, wherein the target key point information includes position information of the target key point; and
    所述针对每一个所述目标图像,基于该目标图像的目标关键点信息,确定所述目标对象的初始位姿信息,包括:For each of the target images, determining the initial pose information of the target object based on the target key point information of the target image includes:
    基于所述目标关键点的位置信息以及相机标定信息,确定该目标关键点在世界坐标系下的初始位置信息;所述世界坐标系包括以所述目标对象的运动平面作为坐标面的坐标系;Based on the position information of the target key point and the camera calibration information, determine the initial position information of the target key point in the world coordinate system; the world coordinate system includes a coordinate system with the motion plane of the target object as a coordinate plane;
    确定与所述初始位置信息匹配的初始偏航角信息;determining initial yaw angle information matching the initial position information;
    基于所述初始位置信息和所述初始偏航角信息,确定所述目标对象的初始位姿信息。Based on the initial position information and the initial yaw angle information, determine initial pose information of the target object.
  8. 根据权利要求7所述的方法,其特征在于,所述确定与所述初始位置信息匹配的初始偏航角信息,包括:The method according to claim 7, wherein the determining the initial yaw angle information matching the initial position information comprises:
    在目标角度范围内,选取多个角度与所述初始位置信息进行匹配,确定所述目标关键点对应的模型关键点在各个角度下分别对应的重投影误差值;Within the target angle range, select a plurality of angles to match the initial position information, and determine the reprojection error values corresponding to the model key points corresponding to the target key points at each angle;
    将最小的重投影误差值对应的角度信息确定为所述初始偏航角信息。The angle information corresponding to the smallest reprojection error value is determined as the initial yaw angle information.
  9. 根据权利要求1至8中任一项所述的方法,其特征在于,所述初始模型预先基于以下步骤确定:The method according to any one of claims 1 to 8, wherein the initial model is determined in advance based on the following steps:
    针对对象模型库中的每一个对象模型,确定表征该对象模型的模型关键点在目标图像中的投影点;以及For each object model in the object model library, determine the projection points of the model key points representing the object model in the target image; and
    确定所述投影点与所述目标图像的对应关键点之间的重投影误差值;determining a reprojection error value between the projected point and a corresponding keypoint of the target image;
    将最小的重投影误差值所对应的对象模型确定为所述初始模型。The object model corresponding to the smallest reprojection error value is determined as the initial model.
  10. 根据权利要求1至9中任一项所述的方法,其特征在于,所述待处理图像集包括从所述目标对象的运动路径中确定的多个目标图像。The method according to any one of claims 1 to 9, wherein the set of images to be processed includes a plurality of target images determined from the moving path of the target object.
  11. A pose determination apparatus, comprising:
    an acquisition module, configured to acquire a set of images to be processed, the set of images to be processed comprising a plurality of target images corresponding to a target object;
    a first determination module, configured to determine, for each of the target images, initial pose information of the target object based on target key point information of that target image;
    a second determination module, configured to determine an initial model of the target object; and
    a third determination module, configured to determine a target model of the target object and a target pose of the target object based on the initial model and the initial pose information corresponding to each of the plurality of target images.
  12. An electronic device, comprising a processor and a memory, the memory storing computer-readable instructions which, when executed by the processor, carry out the method according to any one of claims 1 to 10.
  13. A readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, carries out the method according to any one of claims 1 to 10.
  14. A computer program product comprising a computer program, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method according to any one of claims 1 to 10.
PCT/CN2022/105549 2021-08-13 2022-07-13 Pose determination method and apparatus, electronic device, and readable storage medium WO2023016182A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110932980.9A CN113793251A (en) 2021-08-13 2021-08-13 Pose determination method and device, electronic equipment and readable storage medium
CN202110932980.9 2021-08-13

Publications (1)

Publication Number Publication Date
WO2023016182A1 true WO2023016182A1 (en) 2023-02-16

Family

ID=79181869

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/105549 WO2023016182A1 (en) 2021-08-13 2022-07-13 Pose determination method and apparatus, electronic device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN113793251A (en)
WO (1) WO2023016182A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793251A (en) * 2021-08-13 2021-12-14 北京迈格威科技有限公司 Pose determination method and device, electronic equipment and readable storage medium
CN114332939B (en) * 2021-12-30 2024-02-06 浙江核新同花顺网络信息股份有限公司 Pose sequence generation method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10186049B1 (en) * 2017-03-06 2019-01-22 URC Ventures, Inc. Determining changes in object structure over time using mobile device images
CN109816704A (en) * 2019-01-28 2019-05-28 北京百度网讯科技有限公司 The 3 D information obtaining method and device of object
CN112991436A (en) * 2021-03-25 2021-06-18 中国科学技术大学 Monocular vision SLAM method based on object size prior information
CN113034582A (en) * 2021-03-25 2021-06-25 浙江商汤科技开发有限公司 Pose optimization device and method, electronic device and computer readable storage medium
CN113793251A (en) * 2021-08-13 2021-12-14 北京迈格威科技有限公司 Pose determination method and device, electronic equipment and readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108122256B (en) * 2017-12-25 2018-10-12 北京航空航天大学 A method of it approaches under state and rotates object pose measurement
CN111310574B (en) * 2020-01-17 2022-10-14 清华大学 Vehicle-mounted visual real-time multi-target multi-task joint sensing method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHANG YANG, SUN XIAOLIANG; ZHANG YUEQIANG; LI YOU; YU QIFENG: "Research on 3D Target Pose Tracking and Modeling", ACTA GEODAETICA ET CARTOGRAPHICA SINICA, vol. 47, no. 6, 30 June 2018 (2018-06-30), pages 799 - 808, XP093034918, ISSN: 1001-1595, DOI: 10.11947/j.AGCS.2018.20170626 *

Also Published As

Publication number Publication date
CN113793251A (en) 2021-12-14

Similar Documents

Publication Publication Date Title
Fan et al. Pothole detection based on disparity transformation and road surface modeling
CN108764048B (en) Face key point detection method and device
WO2023016271A1 (en) Attitude determining method, electronic device, and readable storage medium
Bae et al. High-precision vision-based mobile augmented reality system for context-aware architectural, engineering, construction and facility management (AEC/FM) applications
JP6031554B2 (en) Obstacle detection method and apparatus based on monocular camera
CA2826534C (en) Backfilling points in a point cloud
EP3457357A1 (en) Methods and systems for surface fitting based change detection in 3d point-cloud
WO2023016182A1 (en) Pose determination method and apparatus, electronic device, and readable storage medium
Tang et al. ESTHER: Joint camera self-calibration and automatic radial distortion correction from tracking of walking humans
Ji et al. RGB-D SLAM using vanishing point and door plate information in corridor environment
CN113450579B (en) Method, device, equipment and medium for acquiring speed information
WO2023071790A1 (en) Pose detection method and apparatus for target object, device, and storage medium
WO2023284358A1 (en) Camera calibration method and apparatus, electronic device, and storage medium
Dong et al. FSD-SLAM: a fast semi-direct SLAM algorithm
US20220343601A1 (en) Weak multi-view supervision for surface mapping estimation
KR20190060679A (en) Apparatus and method for learning pose of a moving object
CN114972492A (en) Position and pose determination method and device based on aerial view and computer storage medium
Seo et al. An efficient detection of vanishing points using inverted coordinates image space
Jo et al. Mixture density-PoseNet and its application to monocular camera-based global localization
CN114648639B (en) Target vehicle detection method, system and device
WO2022107548A1 (en) Three-dimensional skeleton detection method and three-dimensional skeleton detection device
JP2023065296A (en) Planar surface detection apparatus and method
Saito et al. In-plane rotation-aware monocular depth estimation using slam
JP2023056466A (en) Global positioning device and method for global positioning
Kang et al. 3D urban reconstruction from wide area aerial surveillance video

Legal Events

Date Code Title Description
121  EP: the EPO has been informed by WIPO that EP was designated in this application
     Ref document number: 22855165
     Country of ref document: EP
     Kind code of ref document: A1
NENP Non-entry into the national phase
     Ref country code: DE