WO2023142353A1 - Pose prediction method and apparatus - Google Patents

Pose prediction method and apparatus

Info

Publication number
WO2023142353A1
Authority
WO
WIPO (PCT)
Prior art keywords
target device
predicted
pose
target
visual information
Prior art date
Application number
PCT/CN2022/100638
Other languages
French (fr)
Chinese (zh)
Inventor
陈星鑫
庞敏健
万培佩
刘贤焯
Original Assignee
奥比中光科技集团股份有限公司
Priority date
Filing date
Publication date
Application filed by 奥比中光科技集团股份有限公司
Publication of WO2023142353A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3407 Route searching; Route guidance specially adapted for specific applications
    • G01C21/343 Calculating itineraries, i.e. routes leading from a starting point to a series of categorical destinations using a global route restraint, round trips, touristic trips
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • the present application relates to the technical field of positioning, and in particular to a pose prediction method and device.
  • SLAM (Simultaneous Localization and Mapping) is a technology that uses a device's onboard sensors to perceive the environment, compute the device's pose in real time, and build an incremental map. It does not require modifying the external environment, its positioning accuracy can reach the centimeter level, and its application fields include AR/VR, robotics, autonomous driving, drones, etc.
  • the positioning of existing SLAM systems depends heavily on visual information. If the environment has weak texture or the device moves quickly within a certain period of time, the images in the visual information collected by the SLAM system may be weakly textured, occluded, or blurred, which affects the normal operation of the SLAM system. For example, when no visual information is collected, the SLAM system cannot estimate the pose of the device, so it cannot operate and cannot be restarted, and it must wait until visual information is collected again before it can restart and resume operation. Therefore, there is an urgent need for a technical solution to the problem that, when no visual information is collected, the SLAM system cannot estimate the device's pose and consequently cannot operate or be restarted.
  • the embodiments of the present application provide a pose prediction method, apparatus, computer device, and computer-readable storage medium to solve the prior-art problem that, when no visual information is collected, the SLAM system cannot estimate the device's pose and therefore cannot operate or be restarted.
  • the first aspect of the embodiments of the present application provides a pose prediction method, the method comprising: acquiring a movement speed parameter of a target device at a previous moment, wherein the movement speed parameter includes the angular velocity and linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information.
  • the second aspect of the embodiment of the present application provides a device for pose prediction, the device comprising:
  • a parameter acquisition module which acquires a movement speed parameter of the target device at a previous moment; wherein, the movement speed parameter includes the angular velocity and linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
  • an increment determination module, configured to estimate the predicted displacement increment of the target device at each current moment by using the motion speed parameter and the preset displacement prediction model, wherein the current moments are the latest moment at which the target device loses visual information and every moment after the visual information is lost;
  • a pose and trajectory prediction module configured to calculate a predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment, and construct a predicted motion trajectory of the target device;
  • An optimization module configured to optimize the predicted pose and the predicted motion trajectory of the target device, and acquire target poses and corresponding target motion trajectories of the target device at each current moment.
  • the third aspect of the embodiments of the present application provides a pose prediction system, including an inertial navigation sensor and a terminal device, wherein the inertial navigation sensor is used to acquire the movement speed parameter of the terminal device at the previous moment, and the terminal device is used to implement the steps of the above method.
  • a fourth aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and operable on the processor, and the processor implements the steps of the above method when executing the computer program.
  • a fifth aspect of the embodiments of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above method are implemented.
  • compared with the prior art, the present application has the beneficial effect that the pose of the target device can be estimated using only the target device's motion speed parameters, without requiring visual information of the target device; even when no visual information of the target device is collected, the pose of the target device can still be estimated, so that the SLAM system can continue operating with the estimated pose, thereby ensuring the robustness of the SLAM system.
  • Fig. 1 is a schematic diagram of a pose prediction system provided by an embodiment of the present application
  • Fig. 2 is a flow chart of the pose prediction method provided by the embodiment of the present application.
  • Fig. 3 is a schematic diagram of the network architecture of the displacement prediction model provided by the embodiment of the present application.
  • Fig. 4 is a schematic diagram of an optimized pose graph provided by an embodiment of the present application.
  • FIG. 5 is a block diagram of a pose prediction device provided in an embodiment of the present application.
  • Fig. 6 is a schematic diagram of a computer device provided by an embodiment of the present application.
  • the present invention provides a pose prediction method that obtains the motion speed parameters of the target device at the previous moment; estimates the predicted displacement increment of the target device at each current moment using the motion speed parameters and a preset displacement prediction model; computes the predicted pose of the target device corresponding to each current moment from the predicted displacement increments and constructs the predicted motion trajectory; and optimizes the predicted poses and predicted motion trajectory to obtain the target pose and corresponding target motion trajectory of the target device at each current moment.
  • Fig. 1 is a schematic diagram of the pose prediction system provided in this embodiment, which includes an inertial measurement unit (IMU) 1 and a terminal device 2; in one implementation, the terminal device 2 may be a server or any electronic device that supports data processing, including but not limited to smartphones, sweeping robots, tablets, laptop computers, and desktop computers.
  • the IMU1 can be set on a target device, and the target device can be a device that needs to be positioned, for example, the target device can be an AR/VR device, a robot, a vehicle, and the like.
  • the IMU1 can measure the movement speed parameter of the target device at the previous moment, and send the movement speed parameter to the terminal device 2 .
  • After the terminal device 2 obtains the movement speed parameter of the target device at the previous moment, it can obtain the predicted displacement increment of the target device according to the movement speed parameter and the preset displacement prediction model; from the predicted displacement increment, the terminal device 2 can determine the predicted pose and predicted motion trajectory of the target device; the terminal device 2 can then optimize the predicted pose and predicted motion trajectory to obtain the target pose and target motion trajectory of the target device.
  • Fig. 2 is a flow chart of a pose prediction method provided by an embodiment of the present application.
  • a pose prediction method in FIG. 2 may be executed by the data processing device 2 in FIG. 1 .
  • the pose prediction method includes:
  • S201 Obtain a movement speed parameter of the target device at a previous moment; the movement speed parameter includes the angular velocity and the linear acceleration of the target device.
  • the target device can be understood as a device that needs to predict a pose, for example, the target device can be an AR/VR device, a mobile phone, an autonomously moving robot, or an unmanned vehicle.
  • the target device may be equipped with a SLAM (Simultaneous Localization and Mapping) system, wherein the SLAM system includes an image acquisition device (such as a camera) and an IMU.
  • the image acquisition device may collect the visual information of the target device every preset time period, for example, the image acquisition device may collect the visual information of the target device every 30 ms.
  • the visual information of the target device may be understood as image frames around the target device, for example, image frames of the front side, rear side, left side, and right side of the target device.
  • the IMU can measure the motion speed parameters of the target device at preset intervals, for example, the IMU can measure the motion speed parameters of the target device at a frequency of 100 Hz.
  • the moving speed parameter of the target device can be understood as the speed data of the target device during the moving process.
  • the movement speed parameter may include the angular velocity and linear acceleration in the device coordinate system of the target device, that is, the angular velocity and linear acceleration in the body coordinate system (ie, body frame) of the target device.
  • since the IMU includes three single-axis accelerometers and three single-axis gyroscopes, the accelerometers can be used to detect the linear acceleration of the target device along the x-, y-, and z-axes of the device coordinate system, and the gyroscopes can be used to detect the angular velocity of the target device about the x-, y-, and z-axes of the device coordinate system; it can be understood that the movement speed parameter can include the angular velocity and linear acceleration along each of the x-, y-, and z-axes in the device coordinate system of the target device.
  • the IMU may also be a 9-axis IMU, that is, include a magnetometer, and the IMU is not specifically limited in this embodiment.
  • this embodiment uses the IMU to obtain the motion velocity parameters of the target device at the previous moment to predict the pose of the target device at the current moment. It should be noted that the previous moment is a historical moment before the target device loses the visual information.
  • in one implementation of this embodiment, before the step of acquiring the movement speed parameter of the target device at the previous moment, the method may further include the following steps: judging whether the visual information of the target device is detected; and, if the visual information of the target device is not detected, performing the step of acquiring the movement speed parameter of the target device at the previous moment.
  • if the visual information of the target device is not detected, the visual information of the target device has been lost, so the visual information cannot be used to estimate the pose of the target device; at this time, the motion speed parameters of the target device at the previous moment need to be obtained, so that they can subsequently be used to estimate the pose of the target device.
  • more specifically, assuming the target device is equipped with a SLAM system, then in the main function of the SLAM tracking thread: when the system state of the SLAM system is normal, the SLAM system is in the normal tracking state; when the visual information is lost, the SLAM system enters the tracking-lost state, and the motion speed parameters collected by the IMU can be integrated to obtain a dead-reckoned pose; however, because the reliability of a pose recursively obtained through integration is not high, if the visual information has not returned to normal after the loss exceeds a preset duration (for example, 1 s), the SLAM system enters the tracking-lost state, and at this point the step of obtaining the motion speed parameter of the target device at the previous moment needs to be executed.
  • S202 Estimate the predicted displacement increment of the target device at each current moment by using the motion speed parameter and the preset displacement prediction model.
  • the current moment is the latest moment when the target device loses the visual information and every moment after the visual information is lost.
  • after the motion speed parameters of the target device at the historical moment are obtained, the preset displacement prediction model can be used to determine, from these parameters, the predicted displacement increment of the target device at each subsequent current moment, where the predicted displacement increment at a current moment can be understood as the predicted change in displacement at that moment during the period in which the target device has lost visual information.
  • the target rotation matrix may be determined according to the angular velocity of the target device in the device coordinate system at the previous moment.
  • preferably, the angular velocity in the device coordinate system of the target device is integrated to obtain the target rotation matrix, where the target rotation matrix can be understood as the rotation matrix that transforms the motion speed parameters from the device coordinate system (i.e., the body frame) into the preset world coordinate system (world frame) of the SLAM system, that is, the rotation Rwb.
  • the target rotation matrix is then used to rotate the angular velocity and linear acceleration from the device coordinate system into the world coordinate system, and the world-frame angular velocity and linear acceleration are input into the preset displacement prediction model to obtain the predicted displacement increment of the target device at each current moment.
  • the input of the preset displacement prediction model can be the angular velocity and linear acceleration along the x-, y-, and z-axes in the world coordinate system over a fixed time window (for example, 1 s); it should be noted that, assuming the window length is 1 s, since the angular velocity and linear acceleration data at one moment are 6-dimensional, if the acquisition frequency of the angular velocity and linear acceleration is 250 Hz, then 250 samples of angular velocity and linear acceleration are collected in 1 s, so it can be understood that the input tensor size of the preset displacement prediction model is batch_size × 6 × 250.
  • the output of the preset displacement prediction model is the predicted displacement increment of the target device along the x-, y-, and z-axes in the world coordinate system, that is, the displacement increment of the target device along the x-, y-, and z-axes of the world coordinate system over the time window.
  • as shown in Fig. 3, the preset displacement prediction model includes a plurality of cascaded convolutional layers and an output layer; the cascaded convolutional layers are connected in sequence, and the output layer is composed of a global average pooling layer (AvgPool1d) and a convolutional layer (Conv1d) in series, which preserves the spatial structure of the network, greatly reduces the number of model parameters, shortens inference time, and helps prevent overfitting.
  • the displacement prediction model also includes BN (batch normalization) layers and ReLU layers, which are omitted in Fig. 3; in Fig. 3, c represents the number of channels and k represents the size of the convolution kernel.
  • the model can be supervised during training by using a loss function, wherein the loss function is preferably a mean-square error (MSE) loss, denoted L_mse. Writing $\Delta p_i$ for the ground-truth displacement increment and $\Delta \hat{p}_i$ for the predicted displacement increment at the i-th moment, with $i \in [0, n]$ representing the i-th moment, the MSE loss takes a standard form such as $L_{mse} = \frac{1}{n+1} \sum_{i=0}^{n} \lVert \Delta \hat{p}_i - \Delta p_i \rVert_2^2$.
  • in other implementations, the network structure of the displacement prediction model can also be a one-dimensional variant of ResNet-18, a TCN, an LSTM, or another neural network structure, which is not limited here; a minimal sketch of such a network follows.
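The sketch below is a minimal PyTorch illustration of this kind of architecture: stacked Conv1d/BN/ReLU blocks followed by a global-average-pooling plus Conv1d output head, matching the input/output shapes described above. The number of cascaded blocks, channel width, and kernel size are illustrative assumptions rather than values taken from Fig. 3, and PyTorch is only one possible framework.

```python
import torch
import torch.nn as nn

class DisplacementPredictor(nn.Module):
    """Sketch of a 1-D CNN displacement prediction model.

    Input : (batch_size, 6, 250) world-frame angular velocity and linear
            acceleration over a 1 s window sampled at 250 Hz.
    Output: (batch_size, 3) predicted displacement increment on the x, y, z axes.
    Channel width, kernel size, and depth are illustrative assumptions.
    """

    def __init__(self, in_channels: int = 6, width: int = 64, kernel: int = 3):
        super().__init__()
        layers = []
        channels = in_channels
        for _ in range(4):  # cascaded Conv1d + BN + ReLU blocks
            layers += [
                nn.Conv1d(channels, width, kernel_size=kernel, padding=kernel // 2),
                nn.BatchNorm1d(width),
                nn.ReLU(inplace=True),
            ]
            channels = width
        self.backbone = nn.Sequential(*layers)
        # Output head: global average pooling followed by a 1x1 Conv1d giving 3 values.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(width, 3, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x)).squeeze(-1)  # (batch_size, 3)


if __name__ == "__main__":
    model = DisplacementPredictor()
    imu_window = torch.randn(8, 6, 250)                # batch_size x 6 x 250, as in the text
    pred = model(imu_window)                           # (8, 3) displacement increments
    loss = nn.MSELoss()(pred, torch.zeros_like(pred))  # MSE supervision, as described above
```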
  • S203 Calculate the predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment, and construct a predicted motion trajectory of the target device.
  • the predicted displacement increment at each current moment and the pose information of the target device at the previous moment can be used to obtain the predicted pose information of the target device at each current moment, and the predicted pose information at all current moments can then be used to construct the predicted motion trajectory of the target device.
  • the predicted displacement increment of the target device includes a predicted displacement increment on the X axis, a predicted displacement increment on the Y axis, and a predicted displacement increment on the Z axis.
  • specifically, the predicted pose of the target device on the X-, Y-, and Z-axes at the current moment is determined from the predicted displacement increment on the X-axis, the predicted displacement increment on the Y-axis, the predicted displacement increment on the Z-axis, and the pose information of the target device at the previous moment. For example, the predicted displacement increments on the X-, Y-, and Z-axes are added to the pose information of the target device at the previous moment (that is, its coordinates on the x-, y-, and z-axes) to obtain the predicted pose of the target device on the X-, Y-, and Z-axes at the current moment (that is, the predicted coordinate values). In this way, the predicted pose information of the target device at each current moment after losing visual information can be determined: the predicted positions of the target device on the X-, Y-, and Z-axes at the current moment are taken as its predicted pose information at that moment; this predicted pose is then regarded as the pose of the previous moment, and the above calculation is repeated to predict the pose of the next current moment, thereby obtaining the predicted pose at every current moment during the period in which the target device has lost visual information.
  • the predicted motion trajectory of the target device is then constructed from the predicted pose information at each current moment, as shown by the solid line in Fig. 4; a code sketch of this step follows.
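This sketch assumes the pose is represented only by its x, y, z position and that the model's per-window displacement increments are already expressed in the world frame; the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def accumulate_trajectory(last_pose_xyz, displacement_increments):
    """Chain predicted displacement increments into a predicted trajectory.

    last_pose_xyz          : (3,) position of the target device at the last
                             moment before visual information was lost.
    displacement_increments: (N, 3) predicted world-frame increments, one per
                             current moment output by the displacement model.
    Returns an (N, 3) array of predicted positions, one per current moment.
    """
    increments = np.asarray(displacement_increments, dtype=float)
    # Each predicted pose becomes the "previous moment" pose for the next step,
    # which is exactly a cumulative sum of the increments.
    return np.asarray(last_pose_xyz, dtype=float) + np.cumsum(increments, axis=0)

# Example: three windows of predicted increments after tracking is lost.
trajectory = accumulate_trajectory([0.0, 0.0, 0.0],
                                   [[0.10, 0.02, 0.0],
                                    [0.09, 0.01, 0.0],
                                    [0.11, 0.00, 0.0]])
```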
  • in other implementations, this application can also perform pose estimation through methods such as a zero-velocity update combined with an extended Kalman filter (EKF), step counting, and other methods, which are not limited here.
  • although the motion speed parameters of the target device can be used to estimate its predicted pose, in order to ensure the accuracy of the determined motion trajectory it is necessary to continuously check whether visual information of the target device has been collected, so that the predicted motion trajectory of the target device can be corrected according to the collected visual information.
  • in one implementation, step S204 further includes:
  • the predicted pose of the target device is optimized according to the image frame in the visual information to obtain the target pose.
  • the relocation condition may be that the target device can be positioned according to the visual information
  • that is, after the visual information is detected, it is judged whether the image frame in the visual information can be successfully matched with an image frame in the global map or local map pre-established by the SLAM system. If the image frame in the visual information is successfully matched with an image frame in the pre-established global map or local map, the visual information satisfies the relocation condition, so the target device can be located according to the visual information and its target pose can be determined. If the image frame in the visual information cannot be matched with any image frame in the pre-established global map or local map, the visual information does not satisfy the relocation condition, so the target device cannot be located according to the visual information in this way.
  • a pose optimization may be performed on the predicted pose of the target device according to the image frames in the visual information to obtain a target pose.
  • the target pose corresponding to the target device may be determined first according to the image frame in the visual information.
  • specifically, the image frame that matches the image frame in the visual information can be determined in the global map or local map pre-established by the SLAM system, where the similarity between the image frame in the visual information and the matched image frame is greater than a preset threshold, or the matched image frame is the map frame with the largest similarity to the image frame in the visual information among all image frames in the map; then, according to the image frame in the visual information and the matched image frame in the map, the ideal pose of the target device is determined. A sketch of this matching step follows.
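This is a rough illustration only: the patent does not specify how image frames are described or compared, so the use of precomputed global descriptors, cosine similarity, and the threshold value below are all assumptions; only the threshold / best-match rule follows the text above.

```python
import numpy as np

def find_matching_keyframe(query_desc, map_descs, sim_threshold=0.8):
    """Pick the map keyframe that best matches the query image frame.

    query_desc: (D,) global descriptor of the image frame in the visual information.
    map_descs : (K, D) descriptors of the keyframes in the pre-built global/local map.
    Returns (best_index, best_similarity), or (None, best_similarity) when no
    keyframe clears the preset similarity threshold.
    """
    q = query_desc / np.linalg.norm(query_desc)
    m = map_descs / np.linalg.norm(map_descs, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity to every map keyframe
    best = int(np.argmax(sims))       # keyframe with the largest similarity
    if sims[best] > sim_threshold:    # relocation condition satisfied
        return best, float(sims[best])
    return None, float(sims[best])    # relocation condition not satisfied
```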
  • then, the predicted poses and predicted motion trajectory obtained while no visual information was detected are corrected and optimized according to the ideal pose, and the target pose and target motion trajectory of the target device are obtained.
  • specifically, the predicted poses obtained while no visual information was detected (that is, the predicted poses of the target device determined using the preset displacement prediction model) can be obtained first; then, according to the ideal pose, these predicted poses and the corresponding predicted motion trajectory are optimized to obtain the target pose and target motion trajectory of the target device.
  • in this optimization, the predicted poses are used as the optimization variables, and the error of the relative motion estimated between the predicted poses and the ideal pose (that is, along the predicted motion trajectory corresponding to the predicted poses of the target device) is minimized. The optimization model can be written in a standard covariance-weighted form such as $\min_{\{T_i\}} \sum_i r_i^{\mathrm{T}} \Sigma^{-1} r_i$, where $T_i$ is the predicted pose of the target device at the i-th moment, $\Sigma$ is the covariance of the residual, min() represents minimization, $\Delta T_m$ is the observed value of the relative pose between the predicted pose and the ideal pose, and $r_i = r(T_i, \Delta T_m)$ is the residual function representing the residual between the relative pose of the predicted pose to be optimized with respect to the ideal pose and its observed value. The optimization model is then solved to obtain the target motion trajectory, where the start point and end point of the visual-information-loss stage in the target motion trajectory are, respectively, the pose detected before the visual information was lost and the ideal pose of the target device.
  • after optimization, the error of the predicted motion trajectory corresponding to the predicted poses of the target device is reduced, and the head and tail of the visual-information-loss segment in the optimized target motion trajectory are aligned with the trajectory that the SLAM system estimates normally from collected visual information. For example, as shown in Fig. 4, the solid line segment in the visual-information-loss stage is the predicted motion trajectory corresponding to the predicted poses, and the dashed line segment is the optimized target motion trajectory. A sketch of one simplified way to perform such a correction follows.
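The sketch below is a heavily simplified, translation-only version of this idea: it keeps the relative displacements of the predicted trajectory as soft constraints while anchoring the first point to the last pose before the loss and the last point to the relocalized ideal pose, then solves the resulting weighted least-squares problem. The full method described above optimizes poses with a residual covariance, so this is an illustrative stand-in, not the patent's optimization.

```python
import numpy as np

def correct_trajectory(pred_positions, ideal_end, anchor_weight=1e6):
    """Translation-only sketch of the trajectory correction described above.

    pred_positions: (N, 3) predicted positions during the visual-loss stage,
                    whose first entry is the last pose before the loss.
    ideal_end     : (3,) relocalized ("ideal") position once visual information
                    satisfies the relocation condition again.
    Solves a weighted least-squares problem whose residuals are (a) the relative
    displacements of the predicted trajectory and (b) strong anchors fixing the
    first point to its original value and the last point to the ideal pose.
    """
    p = np.asarray(pred_positions, dtype=float)
    n = len(p)
    rows, rhs = [], []
    # Relative-motion constraints: x[i+1] - x[i] should match the predicted increment.
    for i in range(n - 1):
        a = np.zeros(n)
        a[i], a[i + 1] = -1.0, 1.0
        rows.append(a)
        rhs.append(p[i + 1] - p[i])
    # Anchor the start at its original position and the end at the ideal pose.
    a0 = np.zeros(n); a0[0] = anchor_weight
    an = np.zeros(n); an[-1] = anchor_weight
    rows += [a0, an]
    rhs += [anchor_weight * p[0], anchor_weight * np.asarray(ideal_end, dtype=float)]
    A, b = np.vstack(rows), np.vstack(rhs)
    corrected, *_ = np.linalg.lstsq(A, b, rcond=None)  # (N, 3) target trajectory
    return corrected

# Example: pull a 4-point predicted trajectory onto a relocalized end position.
fixed = correct_trajectory([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]], [3.0, 0.4, 0.0])
```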
  • if the visual information of the target device is detected but the visual information does not satisfy the relocation condition, the acquisition of the movement speed parameter of the target device at the previous moment continues to be performed until visual information of the target device is detected that satisfies the relocation condition.
  • a preset time period such as 20s
  • Fig. 5 is a schematic diagram of a pose prediction device provided by an embodiment of the present application. As shown in Figure 5, the pose prediction device includes:
  • the parameter acquisition module 501 is configured to acquire the movement speed parameter of the target device at a previous moment; wherein, the movement speed parameter includes the angular velocity and linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
  • the incremental calculation module 502 is used to estimate the predicted displacement increment of the target device at each current moment by using the motion speed parameter and the preset displacement prediction model, wherein the current moments are the latest moment at which the target device loses visual information and every moment after the visual information is lost;
  • the pose and trajectory prediction module 503 is used to calculate the predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of each current moment of the target device, and construct the predicted motion trajectory of the target device;
  • the optimization module 504 is configured to optimize the predicted pose and predicted motion trajectory of the target device, and obtain the target pose and corresponding target motion trajectory of the target device at each current moment.
  • the incremental calculation module 502 is specifically used for:
  • the predicted displacement increment of the target device includes a predicted displacement increment on the X axis, a predicted displacement increment on the Y axis, and a predicted displacement increment on the Z axis;
  • the position determination module 503 is specifically used for:
  • determining the predicted position of the target device on the X-axis, the predicted position on the Y-axis, and the predicted position on the Z-axis according to the predicted displacement increments on the X-, Y-, and Z-axes;
  • determining the predicted pose of the target device at each current moment according to the predicted positions of the target device on the X-, Y-, and Z-axes.
  • the device also includes a detection module; the detection module is used for:
  • optimization module specifically for:
  • the predicted pose and predicted motion trajectory of the target device are optimized according to the image frames in the visual information to obtain the target pose and target motion trajectory.
  • optimization module specifically for:
  • the predicted pose of the target device is optimized to obtain the target pose.
  • optimization module is also used to:
  • optimization module is also used to:
  • FIG. 6 is a schematic diagram of a terminal device 6 provided by an embodiment of the present application.
  • the terminal device 6 includes: a processor 601 , a memory 602 and a computer program 603 stored in the memory 602 and capable of running on the processor 601 .
  • when the processor 601 executes the computer program 603, the steps in the foregoing method embodiments are implemented; alternatively, when the processor 601 executes the computer program 603, the functions of the modules/units in the foregoing apparatus embodiments are realized.
  • the computer program 603 can be divided into one or more modules/units, and the one or more modules/units are stored in the memory 602 and executed by the processor 601 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 603 in the terminal device 6.
  • the terminal device 6 may include, but not limited to, a processor 601 and a memory 602 .
  • Fig. 6 is only an example of the terminal device 6 and does not constitute a limitation on the terminal device 6; it may include more or fewer components than shown in the figure, combine certain components, or have different components.
  • for example, the terminal device may also include input and output devices, network access devices, a bus, and so on.
  • the processor 601 can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the storage 602 may be an internal storage module of the terminal device 6 , for example, a hard disk or memory of the terminal device 6 .
  • the memory 602 can also be an external storage device of the computer device 6, for example, a plug-in hard disk equipped on the terminal device 6, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash memory card ( Flash Card), etc. Further, the memory 602 may also include both an internal storage module of the terminal device 6 and an external storage device.
  • the memory 602 is used to store computer programs and other programs and data required by the computer equipment.
  • the memory 602 can also be used to temporarily store data that has been output or will be output.
  • modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the disclosed apparatus/computer equipment and methods can be implemented in other ways.
  • the apparatus/computer device embodiments described above are only illustrative; for example, the division into modules or units is only a logical functional division, and there may be other division methods in actual implementation, and multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • a module described as a separate component may or may not be physically separated, and a component shown as a module may or may not be a physical module, that is, it may be located in one place, or may also be distributed to multiple network modules. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
  • if the integrated modules/units are implemented in the form of software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the present application implements all or part of the processes in the methods of the above embodiments, which can also be completed by instructing the relevant hardware through a computer program.
  • the computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps in the above-mentioned method embodiments can be realized.
  • a computer program may include computer program code, which may be in source code form, object code form, executable file, or some intermediate form or the like.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, computer-readable media may not include electrical carrier signals and telecommunication signals.

Abstract

A pose prediction method and apparatus. When a target device loses visual information, the pose of the target device is estimated using only a motion speed parameter of the target device. Even if the visual information of the target device is not collected, the pose of the target device can still be estimated, so that a SLAM system can continue working by using the estimated pose, thereby ensuring the robustness of the SLAM system.

Description

A pose prediction method and device
This application claims priority to the Chinese patent application with application number 202210093752.1, entitled "A pose prediction method and device", filed with the China Patent Office on January 26, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of positioning, and in particular to a pose prediction method and device.
Background
With the development of science and technology, cutting-edge technologies such as AR/VR, robotics, and autonomous driving are advancing rapidly, and all of these fields involve autonomous positioning technology. Autonomous positioning technology is used to determine the pose of a device (robot, unmanned vehicle, mobile phone, etc.).
At present, GPS is usually used for outdoor positioning, while indoors or in places with poor GPS signal, other positioning technologies are needed, such as indoor UWB, Bluetooth, motion capture systems, and so on. Today, SLAM (Simultaneous Localization and Mapping) has gradually become an important indoor positioning technology. SLAM uses a device's onboard sensors to perceive the environment, compute the device's pose in real time, and build an incremental map; it does not require modifying the external environment, its positioning accuracy can reach the centimeter level, and its application fields include AR/VR, robotics, autonomous driving, drones, etc.
The positioning of existing SLAM systems depends heavily on visual information. If the environment has weak texture or the device moves quickly within a certain period of time, the images in the visual information collected by the SLAM system may be weakly textured, occluded, or blurred, which affects the normal operation of the SLAM system. For example, when no visual information is collected, the SLAM system cannot estimate the pose of the device, so it cannot operate and cannot be restarted, and it must wait until visual information is collected again before it can restart and resume operation. Therefore, there is an urgent need for a technical solution to the problem that, when no visual information is collected, the SLAM system cannot estimate the device's pose and consequently cannot operate or be restarted.
Summary of the Invention
In view of this, the embodiments of the present application provide a pose prediction method, apparatus, computer device, and computer-readable storage medium to solve the prior-art problem that, when no visual information is collected, the SLAM system cannot estimate the device's pose and therefore cannot operate or be restarted.
A first aspect of the embodiments of the present application provides a pose prediction method, the method comprising:
acquiring a movement speed parameter of a target device at a previous moment, wherein the movement speed parameter includes the angular velocity and linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
estimating a predicted displacement increment of the target device at each current moment by using the movement speed parameter and a preset displacement prediction model, wherein the current moments are the latest moment at which the target device loses visual information and every moment after the visual information is lost;
calculating the predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment, and constructing a predicted motion trajectory of the target device;
optimizing the predicted poses and the predicted motion trajectory of the target device, and acquiring the target pose and corresponding target motion trajectory of the target device at each current moment.
A second aspect of the embodiments of the present application provides a pose prediction apparatus, the apparatus comprising:
a parameter acquisition module, configured to acquire a movement speed parameter of a target device at a previous moment, wherein the movement speed parameter includes the angular velocity and linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
an increment determination module, configured to estimate a predicted displacement increment of the target device at each current moment by using the movement speed parameter and a preset displacement prediction model, wherein the current moments are the latest moment at which the target device loses visual information and every moment after the visual information is lost;
a pose and trajectory prediction module, configured to calculate the predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment, and to construct a predicted motion trajectory of the target device;
an optimization module, configured to optimize the predicted poses and the predicted motion trajectory of the target device, and to acquire the target pose and corresponding target motion trajectory of the target device at each current moment.
A third aspect of the embodiments of the present application provides a pose prediction system, including an inertial navigation sensor and a terminal device, wherein the inertial navigation sensor is used to acquire the movement speed parameter of the terminal device at the previous moment, and the terminal device is used to implement the steps of the above method.
A fourth aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
A fifth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the above method.
Compared with the prior art, the present application has the beneficial effect that the pose of the target device can be estimated using only the target device's motion speed parameters, without requiring visual information of the target device; in this way, even when no visual information of the target device is collected, the pose of the target device can still be estimated, so that the SLAM system can continue operating with the estimated pose, thereby ensuring the robustness of the SLAM system.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a pose prediction system provided by an embodiment of the present application;
Fig. 2 is a flowchart of a pose prediction method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the network architecture of the displacement prediction model provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of an optimized pose graph provided by an embodiment of the present application;
Fig. 5 is a block diagram of a pose prediction apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of a computer device provided by an embodiment of the present application.
Detailed Description
In the following description, specific details such as particular system structures and technologies are presented for the purpose of illustration rather than limitation, in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application.
The present invention provides a pose prediction method that obtains the motion speed parameters of the target device at the previous moment; estimates the predicted displacement increment of the target device at each current moment using the motion speed parameters and a preset displacement prediction model; computes the predicted pose of the target device corresponding to each current moment from the predicted displacement increments and constructs the predicted motion trajectory; and optimizes the predicted poses and predicted motion trajectory at each current moment to obtain the target pose and corresponding target motion trajectory of the target device at each current moment.
Fig. 1 is a schematic diagram of the pose prediction system provided in this embodiment, which includes an inertial measurement unit (IMU) 1 and a terminal device 2. In one implementation, the terminal device 2 may be a server or any electronic device that supports data processing, including but not limited to smartphones, sweeping robots, tablets, laptop computers, and desktop computers. The IMU 1 can be mounted on a target device, and the target device can be a device that needs to be positioned; for example, the target device can be an AR/VR device, a robot, a vehicle, and the like.
Specifically, the IMU 1 can measure the movement speed parameter of the target device at the previous moment and send the movement speed parameter to the terminal device 2. After the terminal device 2 obtains the movement speed parameter of the target device at the previous moment, it can obtain the predicted displacement increment of the target device according to the movement speed parameter and the preset displacement prediction model; from the predicted displacement increment, the terminal device 2 can determine the predicted pose and predicted motion trajectory of the target device; the terminal device 2 can then optimize the predicted pose and predicted motion trajectory to obtain the target pose and target motion trajectory of the target device.
It should be noted that the above application scenario is shown only to facilitate understanding of the present application, and the embodiments of the present application are not limited in this regard; on the contrary, the embodiments of the present application can be applied to any applicable scenario.
Fig. 2 is a flowchart of a pose prediction method provided by an embodiment of the present application. The pose prediction method in Fig. 2 may be executed by the data processing device 2 in Fig. 1. As shown in Fig. 2, the pose prediction method includes:
S201: Obtain a movement speed parameter of the target device at a previous moment; the movement speed parameter includes the angular velocity and the linear acceleration of the target device.
In this embodiment, the target device can be understood as a device whose pose needs to be predicted; for example, the target device can be an AR/VR device, a mobile phone, an autonomously moving robot, or an unmanned vehicle. In one implementation, the target device may be equipped with a SLAM (Simultaneous Localization and Mapping) system, where the SLAM system includes an image acquisition device (such as a camera) and an IMU.
Specifically, the image acquisition device may collect the visual information of the target device at preset intervals; for example, the image acquisition device may collect the visual information of the target device every 30 ms. The visual information of the target device may be understood as image frames of the target device's surroundings, for example, image frames of the front, rear, left, and right sides of the target device.
The IMU can measure the movement speed parameters of the target device at preset intervals; for example, the IMU can measure the movement speed parameters of the target device at a frequency of 100 Hz. The movement speed parameter of the target device can be understood as the speed data of the target device during motion. In one implementation, the movement speed parameter may include the angular velocity and linear acceleration in the device coordinate system of the target device, that is, the angular velocity and linear acceleration in the body coordinate system (body frame) of the target device. Since the IMU includes three single-axis accelerometers and three single-axis gyroscopes, the accelerometers can be used to detect the linear acceleration of the target device along the x-, y-, and z-axes of the device coordinate system, and the gyroscopes can be used to detect the angular velocity of the target device about the x-, y-, and z-axes of the device coordinate system; it can be understood that the movement speed parameter can include the angular velocity and linear acceleration along each of the x-, y-, and z-axes in the device coordinate system of the target device. It should be noted that the IMU may also be a 9-axis IMU, that is, one that additionally includes a magnetometer; the IMU is not specifically limited in this embodiment.
As an example, so that the pose of the target device can still be estimated when the target device loses visual information, this embodiment uses the IMU to obtain the motion speed parameters of the target device at the previous moment in order to predict the pose of the target device at the current moment. It should be noted that the previous moment is a historical moment before the target device loses the visual information.
It should be noted that, in one implementation of this embodiment, before the step of acquiring the movement speed parameter of the target device at the previous moment, the method may further include the following steps:
judging whether the visual information of the target device is detected;
if the visual information of the target device is not detected, performing the step of acquiring the movement speed parameter of the target device at the previous moment.
It can be understood that, in this implementation, if the visual information of the target device is not detected, the visual information of the target device has been lost, so the visual information cannot be used to estimate the pose of the target device; at this time, the motion speed parameters of the target device at the previous moment need to be obtained, so that they can subsequently be used to estimate the pose of the target device.
More specifically, assuming the target device is equipped with a SLAM system, then in the main function of the SLAM tracking thread: when the system state of the SLAM system is normal, the SLAM system is in the normal tracking state; when the visual information is lost, the SLAM system enters the tracking-lost state, and the motion speed parameters collected by the IMU can be integrated to obtain a dead-reckoned pose; however, because the reliability of a pose recursively obtained through integration is not high, if the visual information has not returned to normal after the loss exceeds a preset duration (for example, 1 s), the SLAM system enters the tracking-lost state, and at this point the step of obtaining the motion speed parameter of the target device at the previous moment needs to be executed. A minimal sketch of this state handling follows.
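In this sketch, the function name, the returned mode labels, and the 1 s threshold constant are illustrative assumptions rather than the patent's implementation; it only mirrors the decision logic described above.

```python
LOSS_TIMEOUT_S = 1.0   # preset duration after which tracking is declared lost

def select_tracking_mode(visual_frame_available: bool, seconds_since_last_frame: float) -> str:
    """Decide how the tracking thread estimates the pose at this iteration.

    Returns one of:
      "vision"             - visual information available, normal tracking state;
      "imu_integration"    - visual information lost for less than the preset
                             duration, so IMU measurements are integrated
                             (dead reckoning);
      "displacement_model" - loss exceeded the preset duration, so the motion
                             speed parameters of the previous moment are fed to
                             the preset displacement prediction model (S201/S202).
    """
    if visual_frame_available:
        return "vision"
    if seconds_since_last_frame <= LOSS_TIMEOUT_S:
        return "imu_integration"
    return "displacement_model"

assert select_tracking_mode(False, 2.5) == "displacement_model"
```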
S202: Estimate the predicted displacement increment of the target device at each current moment by using the motion speed parameters and a preset displacement prediction model.
Here, a current moment is the latest moment at which the target device loses the visual information and each moment after the visual information is lost.
In this embodiment, after the motion speed parameters of the target device at the historical moment are obtained, the preset displacement prediction model can be used to determine, from the motion speed parameters, the predicted displacement increment of the target device at each subsequent current moment, where the predicted displacement increment at a current moment can be understood as the predicted change in displacement at each moment during the period in which the target device has lost visual information.
As an embodiment, the target rotation matrix may be determined from the angular velocity of the target device in the device coordinate system at the previous moment. Preferably, the angular velocity of the target device in the device coordinate system is integrated to obtain the target rotation matrix, where the target rotation matrix can be understood as the rotation matrix (i.e., the rotation Rwb) that transforms the motion speed parameters from the device coordinate system (body frame) to the preset world coordinate system (world frame) of the SLAM system.
The target rotation matrix is then used to rotate the angular velocity and linear acceleration in the device coordinate system into the angular velocity and linear acceleration in the world coordinate system, and the angular velocity and linear acceleration in the world coordinate system are input into the preset displacement prediction model to obtain the predicted displacement increment of the target device at each current moment.
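A numerical sketch of this step is given below. It assumes small integration steps and ignores sensor bias and gravity compensation (which a full SLAM front end would handle), so it is only illustrative:

```python
import numpy as np
from scipy.linalg import expm

def skew(w):
    """Skew-symmetric matrix of a 3-vector."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])

def integrate_gyro(R_wb, gyro_b, dt):
    """Propagate the body-to-world rotation Rwb with one body-frame gyro sample."""
    return R_wb @ expm(skew(gyro_b) * dt)

def to_world_frame(R_wb, gyro_b, accel_b):
    """Rotate body-frame angular velocity and linear acceleration into the world frame."""
    return R_wb @ gyro_b, R_wb @ accel_b

# Example: propagate one 250 Hz sample and express it in the world frame.
R_wb = np.eye(3)
R_wb = integrate_gyro(R_wb, np.array([0.0, 0.0, 0.1]), dt=1 / 250)
w_world, a_world = to_world_frame(R_wb, np.array([0.0, 0.0, 0.1]), np.array([0.0, 0.0, 9.81]))
```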
It can be understood that the input of the preset displacement prediction model may be the angular velocity and linear acceleration along the x-axis, y-axis, and z-axis in the world coordinate system over a fixed time length (for example, 1 s). It should be noted that, assuming the fixed time length is 1 s, since the angular velocity and linear acceleration at one moment form 6-dimensional data, if the angular velocity and linear acceleration are sampled at 250 Hz, then 250 samples are collected within 1 s; it can be understood that the input tensor size of the preset displacement prediction model is batch_size×6×250. The output of the preset displacement prediction model is the predicted displacement increment of the target device along the x-axis, y-axis, and z-axis in the world coordinate system, that is, the displacement increment of the target device along the x-axis, y-axis, and z-axis in the world coordinate system over a period of time.
In one embodiment, the preset displacement prediction model includes a plurality of cascaded convolutional layers and one output layer, as shown in FIG. 3, where the output layer is connected to the last convolutional layer of the plurality of cascaded convolutional layers, and the output layer consists of a global average pooling layer (AvgPool1d) and a convolutional layer (Conv1d) connected in series. This preserves the spatial structure of the network, greatly reduces the number of model parameters, shortens inference time, and also helps prevent overfitting. It should be noted that the displacement prediction model further includes BN layers and ReLU layers, which are omitted from FIG. 3; in FIG. 3, c denotes the number of channels and k denotes the convolution kernel size.
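The PyTorch sketch below is one possible reading of this architecture (cascaded Conv1d + BN + ReLU blocks followed by global average pooling and a 1×1 convolution as the output layer). The channel widths and kernel size are illustrative assumptions, not the values of FIG. 3:

```python
import torch
import torch.nn as nn

class DisplacementPredictor(nn.Module):
    """Input: (batch, 6, 250) world-frame gyro+accel over 1 s; output: (batch, 3) dx, dy, dz."""

    def __init__(self, channels=(32, 64, 128), kernel_size=7):
        super().__init__()
        layers, in_ch = [], 6
        for out_ch in channels:                      # cascaded Conv1d + BN + ReLU blocks
            layers += [nn.Conv1d(in_ch, out_ch, kernel_size, padding=kernel_size // 2),
                       nn.BatchNorm1d(out_ch),
                       nn.ReLU(inplace=True)]
            in_ch = out_ch
        self.backbone = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool1d(1)          # global average pooling
        self.head = nn.Conv1d(in_ch, 3, kernel_size=1)  # 1x1 convolution as the output layer

    def forward(self, x):                            # x: (batch, 6, 250)
        x = self.backbone(x)
        x = self.pool(x)                             # (batch, C, 1)
        return self.head(x).squeeze(-1)              # (batch, 3) displacement increment

model = DisplacementPredictor()
print(model(torch.randn(8, 6, 250)).shape)           # torch.Size([8, 3])
```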
Further, in the process of training the preset displacement prediction model in advance, a loss function can be used to supervise the model; the loss function is preferably the mean-square error (MSE) loss function. The MSE loss can be written as:

L_{mse} = \frac{1}{n}\sum_{i=0}^{n}\left\| \hat{d}_i - d_i \right\|^{2}

where L_{mse} is the MSE loss, \hat{d}_i is the three-dimensional displacement increment output by the displacement prediction model, d_i is the ground-truth (i.e., real) value of the three-dimensional displacement increment, and i ∈ [0, n] denotes the i-th moment.
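A minimal training-step sketch for this supervision is shown below; it reuses the DisplacementPredictor sketch above, and the optimizer and learning rate are placeholder choices:

```python
import torch
import torch.nn as nn

model = DisplacementPredictor()                     # from the sketch above
criterion = nn.MSELoss()                            # mean-square error supervision (L_mse)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(imu_window, gt_increment):
    """imu_window: (batch, 6, 250); gt_increment: (batch, 3) ground-truth dx, dy, dz."""
    optimizer.zero_grad()
    loss = criterion(model(imu_window), gt_increment)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example with random placeholder data.
print(train_step(torch.randn(8, 6, 250), torch.randn(8, 3)))
```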
It should be noted that the network structure of the displacement prediction model may also be a one-dimensional form of ResNet-18, a TCN, an LSTM, or another neural network structure, which is not limited here.
S203: Calculate the predicted pose of the target device at each current moment from the predicted displacement increment of the target device at each current moment, and construct the predicted motion trajectory of the target device.
After the predicted displacement increment at each current moment after the target device loses visual information is determined, the predicted pose information of the target device at each current moment can be obtained from the predicted displacement increment at each current moment and the pose information of the target device at the previous moment; the predicted motion trajectory of the target device is then constructed from the predicted pose information of all current moments.
As an example, the predicted displacement increment of the target device includes a predicted displacement increment on the X-axis, a predicted displacement increment on the Y-axis, and a predicted displacement increment on the Z-axis. Preferably, the predicted pose of the target device on the X-axis, Y-axis, and Z-axis at the current moment is determined from the predicted displacement increments on the X-axis, Y-axis, and Z-axis at the previous moment and the pose information of the target device at the previous moment. For example, the predicted displacement increments on the X-axis, Y-axis, and Z-axis can be added to the pose information of the target device at the previous moment (i.e., its coordinates on the x-axis, y-axis, and z-axis) to obtain the predicted pose of the target device on the X-axis, Y-axis, and Z-axis at the current moment (i.e., the predicted coordinate values). Then, the predicted pose information of the target device at each current moment after losing visual information can be determined from the predicted poses on the X-axis, Y-axis, and Z-axis at each previous moment; that is, the predicted position of the target device on the X-axis, Y-axis, and Z-axis at the current moment is taken as the predicted pose information of the target device at the current moment, the predicted pose information of the current moment is then treated as the pose information of the previous moment, and the above calculation is repeated to predict the pose of the next current moment, thereby obtaining the predicted pose at each current moment during the period in which the target device has lost visual information.
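For the translational part, this recursion reduces to a cumulative sum of predicted increments starting from the last known position; the sketch below adopts that simplification and leaves orientation handling aside:

```python
import numpy as np

def accumulate_predicted_positions(last_known_position, predicted_increments):
    """last_known_position: (3,) x/y/z at the previous moment;
    predicted_increments: (N, 3) model outputs for the N moments without visual information.
    Returns the (N, 3) predicted positions forming the predicted trajectory."""
    positions = []
    current = np.asarray(last_known_position, dtype=float)
    for delta in predicted_increments:
        current = current + np.asarray(delta, dtype=float)   # previous pose + increment
        positions.append(current.copy())
    return np.vstack(positions)

# Example: start at the last visually tracked position and apply three predicted increments.
traj = accumulate_predicted_positions([1.0, 0.0, 0.5],
                                      [[0.1, 0.0, 0.0], [0.1, 0.02, 0.0], [0.09, 0.03, 0.01]])
```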
Further, within the period in which the target device has lost visual information, the predicted motion trajectory of the target device is constructed from the predicted pose information at each current moment, as shown by the solid line in FIG. 4.
It should be understood that, in addition to using a deep-learning method for pose estimation, the present application can also perform pose estimation through zero-velocity updates combined with an extended Kalman filter (EKF), step counting, and other methods, which are not limited here.
S204: Optimize the predicted pose and predicted motion trajectory of the target device to obtain the target pose and target motion trajectory of the target device at each current moment.
It should be noted that, although the motion speed parameters of the target device can be used to estimate the predicted pose of the target device, in order to ensure the accuracy of the determined trajectory it is also necessary to continuously check whether visual information of the target device has been collected, so that the predicted motion trajectory of the target device can be corrected based on the collected visual information.
In a specific implementation, step S204 further includes:
if the visual information of the target device is detected and the visual information satisfies the relocalization condition, performing pose optimization on the predicted pose of the target device based on the image frames in the visual information to obtain the target pose.
More specifically, it can first be determined whether the collected visual information of the target device (for example, after two image frames have been collected) satisfies the relocalization condition, where the relocalization condition may be that the target device can be localized based on the visual information and the target pose of the target device can be determined; that is, after the visual information is detected, it is determined whether an image frame in the visual information is successfully matched with an image frame in the global map or local map pre-built by the SLAM system. If an image frame in the visual information is successfully matched with an image frame in the pre-built global map or local map, the visual information satisfies the relocalization condition, so the target device can be localized based on the visual information and its target pose can be determined; conversely, if the image frame in the visual information cannot be successfully matched with any image frame in the pre-built global map or local map, the visual information does not satisfy the relocalization condition, so the target device cannot be localized based on the visual information and its target pose cannot be determined.
After the visual information of the target device is detected and it is determined that the visual information satisfies the relocalization condition, pose optimization can be performed on the predicted pose of the target device based on the image frames in the visual information to obtain the target pose.
As an example, the ideal pose corresponding to the target device can first be determined from the image frames in the visual information. Specifically, an image frame that matches an image frame in the visual information can first be found in the global map or local map pre-built by the SLAM system, where the similarity between the image frame in the visual information and the matched image frame is greater than a preset threshold, or the matched image frame is the one among all image frames in the map that has the highest similarity to the image frame in the visual information; then, the ideal pose of the target device can be determined from the image frame in the visual information and the matched image frame in the map.
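The sketch below illustrates this matching test with a generic image-descriptor similarity; the descriptor, the threshold value, and the way the keyframe pose seeds the ideal pose are placeholder assumptions rather than the SLAM system's actual matcher:

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.8   # preset threshold (placeholder value)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def find_matching_keyframe(query_descriptor, map_keyframes):
    """map_keyframes: list of (descriptor, keyframe_pose). Returns the best keyframe pose
    and its similarity if the relocalization condition is met, otherwise None."""
    best_pose, best_sim = None, -1.0
    for descriptor, pose in map_keyframes:
        sim = cosine_similarity(query_descriptor, descriptor)
        if sim > best_sim:
            best_pose, best_sim = pose, sim
    if best_pose is not None and best_sim > SIMILARITY_THRESHOLD:
        return best_pose, best_sim     # the matched keyframe pose seeds the ideal pose
    return None                        # relocalization condition not satisfied
```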
After the ideal pose corresponding to the target device is determined, the predicted pose and predicted motion trajectory obtained before visual information was detected are corrected and optimized based on the ideal pose, to obtain the target pose and target motion trajectory of the target device. Specifically, the predicted pose before visual information was detected can be obtained first, that is, the predicted pose of the target device determined using the preset displacement prediction model; the predicted pose before visual information was detected and its corresponding predicted motion trajectory are then optimized based on the ideal pose, to obtain the target pose and target motion trajectory of the target device. Preferably, an optimization model is constructed that takes the predicted poses as the optimization variables and the error of the relative motion estimate between the predicted poses and the ideal pose (i.e., the predicted motion trajectory corresponding to the predicted poses of the target device) as the error term. The optimization model is shown in the following formula:
\min_{T}\ \sum_{i=0}^{n}\left\| r\!\left(T_{i},\ \Delta T_{m}\right)\right\|_{\sigma}^{2}

where T_i is the predicted pose of the target device at the i-th moment; σ is the covariance of the residual; \left\|\cdot\right\|_{\sigma}^{2} denotes the two-norm (least squares) of the sum of residuals; min() denotes minimization; T = {T_i}, i ∈ [0, n], denotes the predicted poses to be optimized; ΔT_m is the observed value of the relative pose between the predicted pose and the ideal pose; and r() is the residual function, representing the residual between the relative pose between the predicted pose to be optimized and the ideal pose and its observed value, specifically:

r\!\left(T_{i},\ \Delta T_{m}\right)=\operatorname{Log}\!\left(\Delta T_{m}^{-1}\,\widehat{\Delta T}_{i}\right),\qquad \widehat{\Delta T}_{i}=T_{\mathrm{ideal}}^{-1}\,T_{i}

where T_{ideal} denotes the ideal pose obtained from relocalization. The optimization model is then solved to obtain the target motion trajectory, where, within the target motion trajectory, the start point and end point of the stage in which visual information was lost are, respectively, the pose detected before the visual information was lost and the ideal pose of the target device. That is, after optimization, the error of the predicted motion trajectory corresponding to the predicted poses of the target device is reduced, and the head and tail of the lost stage in the optimized target motion trajectory are aligned with the trajectory normally estimated by the SLAM system from the collected visual information; for example, as shown in FIG. 4, the solid line segment in the visual-information-loss stage is the predicted motion trajectory corresponding to the predicted poses, and the dashed line segment is the optimized target motion trajectory. Pose optimization is then performed on the predicted poses of the target device according to the optimized target motion trajectory to obtain the target poses, so that the SLAM system can continue tracking based on a relatively accurate absolute position (i.e., the target pose).
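A translation-only sketch of this correction is shown below: the predicted relative displacements are kept as measurements, the segment endpoint is pulled toward the relocalized (ideal) position, and a linear least-squares solve redistributes the endpoint error along the segment. This deliberately simplifies the pose-graph formulation above to positions only:

```python
import numpy as np
from scipy.optimize import least_squares

def correct_trajectory(pred_positions, start_anchor, ideal_end, anchor_weight=100.0):
    """pred_positions: (N, 3) predicted positions during the visual-information loss;
    start_anchor: (3,) pose before the loss; ideal_end: (3,) relocalized ideal position."""
    pred = np.asarray(pred_positions, dtype=float)
    deltas = np.diff(np.vstack([start_anchor, pred]), axis=0)   # predicted relative motions

    def residuals(flat):
        p = flat.reshape(-1, 3)
        rel = np.diff(np.vstack([start_anchor, p]), axis=0) - deltas  # keep relative motion
        anchor = anchor_weight * (p[-1] - np.asarray(ideal_end))      # pull endpoint to ideal pose
        return np.concatenate([rel.ravel(), anchor])

    sol = least_squares(residuals, pred.ravel())
    return sol.x.reshape(-1, 3)   # corrected trajectory for the lost segment

corrected = correct_trajectory([[0.1, 0, 0], [0.2, 0, 0], [0.3, 0, 0]],
                               start_anchor=[0, 0, 0], ideal_end=[0.27, 0.03, 0.0])
```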
It should be noted that, if the visual information of the target device is detected but does not satisfy the relocalization condition, the step of obtaining the motion speed parameters of the target device at the previous moment continues to be performed, until visual information of the target device that satisfies the relocalization condition is detected. Of course, to avoid the problem that the SLAM system cannot restart when no visual information has been collected for a long time, in one implementation of this embodiment, if no visual information of the target device is detected within a preset duration (for example, 20 s), the system is restarted.
All of the above optional technical solutions may be combined in any manner to form optional embodiments of the present application, which will not be described one by one here.
The following are apparatus embodiments of the present application, which can be used to carry out the method embodiments of the present application. For details not disclosed in the apparatus embodiments of the present application, please refer to the method embodiments of the present application.
FIG. 5 is a schematic diagram of a pose prediction apparatus provided by an embodiment of the present application. As shown in FIG. 5, the pose prediction apparatus includes:
a parameter acquisition module 501, configured to acquire the motion speed parameters of the target device at the previous moment, where the motion speed parameters include the angular velocity and linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
an increment calculation module 502, configured to estimate the predicted displacement increment of the target device at each current moment by using the motion speed parameters and a preset displacement prediction model, where a current moment is the latest moment at which the target device loses visual information and each moment after the visual information is lost;
a pose and trajectory prediction module 503, configured to calculate the predicted pose of the target device at each current moment from the predicted displacement increment of the target device at each current moment, and to construct the predicted motion trajectory of the target device;
an optimization module 504, configured to optimize the predicted pose and predicted motion trajectory of the target device, and to acquire the target pose and the corresponding target motion trajectory of the target device at each current moment.
Optionally, the increment calculation module 502 is specifically configured to:
determine the target rotation matrix from the angular velocity of the target device in the device coordinate system at the previous moment;
use the target rotation matrix to convert the angular velocity and linear acceleration in the device coordinate system into the angular velocity and linear acceleration in the world coordinate system;
input the angular velocity and linear acceleration in the world coordinate system into the preset displacement prediction model to obtain the predicted displacement increment of the target device at each current moment.
Optionally, the predicted displacement increment of the target device includes a predicted displacement increment on the X-axis, a predicted displacement increment on the Y-axis, and a predicted displacement increment on the Z-axis;
the pose and trajectory prediction module 503 is specifically configured to:
determine the predicted position of the target device on the X-axis, the Y-axis, and the Z-axis from the predicted displacement increments of the target device on the X-axis, Y-axis, and Z-axis at each current moment and the position information of the target device at the previous moment;
determine the predicted pose of the target device at each current moment from the predicted position of the target device on the X-axis, the Y-axis, and the Z-axis.
Optionally, the apparatus further includes a detection module; the detection module is configured to:
determine whether the visual information of the target device is detected;
if the visual information of the target device is not detected, perform the acquisition of the motion speed parameters of the target device at the previous moment and carry out the above pose prediction method.
Optionally, the optimization module is specifically configured to:
if it is detected that the visual information of the target device satisfies the relocalization condition, optimize the predicted pose and predicted motion trajectory of the target device based on the image frames in the visual information to obtain the target pose and target motion trajectory.
Optionally, the optimization module is specifically configured to:
determine the ideal pose corresponding to the target device from the image frames in the visual information;
optimize the predicted motion trajectory based on the ideal pose and the predicted pose estimated before visual information was detected, to determine the target motion trajectory;
perform pose optimization on the predicted pose of the target device based on the target motion trajectory to obtain the target pose.
Optionally, the optimization module is further configured to:
if it is detected that the visual information of the target device does not satisfy the relocalization condition, continue to perform the acquisition of the motion speed parameters of the target device at the previous moment.
Optionally, the optimization module is further configured to:
restart the system if no visual information of the target device is detected within a preset duration.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
FIG. 6 is a schematic diagram of a terminal device 6 provided by an embodiment of the present application. The terminal device 6 includes: a processor 601, a memory 602, and a computer program 603 stored in the memory 602 and executable on the processor 601. When the processor 601 executes the computer program 603, the steps in each of the above method embodiments are implemented; alternatively, when the processor 601 executes the computer program 603, the functions of the modules in each of the above apparatus embodiments are implemented.
Exemplarily, the computer program 603 may be divided into one or more modules, and the one or more modules are stored in the memory 602 and executed by the processor 601 to complete the present application. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 603 in the terminal device 6.
The terminal device 6 may include, but is not limited to, the processor 601 and the memory 602. Those skilled in the art can understand that FIG. 6 is merely an example of the terminal device 6 and does not constitute a limitation on the terminal device 6; it may include more or fewer components than shown, or combine certain components, or have different components; for example, the terminal device may further include input and output devices, network access devices, buses, and the like.
The processor 601 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
The memory 602 may be an internal storage module of the terminal device 6, for example, a hard disk or memory of the terminal device 6. The memory 602 may also be an external storage device of the terminal device 6, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 6. Further, the memory 602 may include both an internal storage module of the terminal device 6 and an external storage device. The memory 602 is used to store the computer program and other programs and data required by the terminal device. The memory 602 may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the above division of functional modules is used merely as an example for illustration. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the apparatus can be divided into different functional modules to complete all or some of the functions described above. The functional modules in the embodiments may be integrated into one processing module, or each module may exist physically alone, or two or more modules may be integrated into one module; the integrated module may be implemented in the form of hardware or in the form of a software functional module. In addition, the specific names of the functional modules are only for the convenience of distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working process of the modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or described in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art may realize that the modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/computer device and method may be implemented in other ways. For example, the apparatus/computer device embodiments described above are merely illustrative; for example, the division of modules is only a logical function division, and there may be other division methods in actual implementation; multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or modules, and may be in electrical, mechanical, or other forms.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist physically alone, or two or more modules may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the present application implements all or part of the processes in the methods of the above embodiments, which may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of each of the above method embodiments may be implemented. The computer program may include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or apparatus capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (12)

  1. A pose prediction method, characterized in that the method comprises:
    acquiring a motion speed parameter of a target device at a previous moment, wherein the motion speed parameter comprises an angular velocity and a linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
    estimating a predicted displacement increment of the target device at each current moment by using the motion speed parameter and a preset displacement prediction model, wherein the current moment is the latest moment at which the target device loses the visual information and each moment after the visual information is lost;
    calculating a predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment, and constructing a predicted motion trajectory of the target device;
    optimizing the predicted pose and the predicted motion trajectory of the target device, and acquiring a target pose and a corresponding target motion trajectory of the target device at each current moment.
  2. The method according to claim 1, characterized in that the estimating the predicted displacement increment of the target device at each current moment by using the motion speed parameter and the preset displacement prediction model comprises:
    determining a target rotation matrix according to the angular velocity of the target device in a device coordinate system at the previous moment;
    converting the angular velocity and the linear acceleration in the device coordinate system into an angular velocity and a linear acceleration in a world coordinate system by using the target rotation matrix;
    inputting the angular velocity and the linear acceleration in the world coordinate system into the preset displacement prediction model to obtain the predicted displacement increment of the target device at each current moment.
  3. The method according to claim 2, characterized in that the predicted displacement increment of the target device comprises a predicted displacement increment on an X-axis, a predicted displacement increment on a Y-axis, and a predicted displacement increment on a Z-axis;
    the calculating the predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment comprises:
    determining a predicted position of the target device on the X-axis, a predicted position on the Y-axis, and a predicted position on the Z-axis according to the predicted displacement increments of the target device on the X-axis, the Y-axis, and the Z-axis at each current moment and position information of the target device at the previous moment;
    determining the predicted pose of the target device at each current moment according to the predicted position of the target device on the X-axis, the predicted position on the Y-axis, and the predicted position on the Z-axis.
  4. The method according to any one of claims 1 to 3, characterized in that the preset displacement prediction model comprises a plurality of cascaded convolutional layers and one output layer, wherein the output layer is connected to the last convolutional layer of the plurality of cascaded convolutional layers, and the output layer consists of a global average pooling layer and a convolutional layer connected in series; and a loss function of the displacement prediction model is a mean-square error loss function.
  5. The method according to any one of claims 1 to 3, characterized in that, before the step of acquiring the motion speed parameter of the target device at the previous moment, the method further comprises:
    determining whether the visual information of the target device is detected;
    if the visual information of the target device is not detected, performing the step of acquiring the motion speed parameter of the target device at the previous moment.
  6. The method according to claim 5, characterized in that the optimizing the predicted pose and the predicted motion trajectory of the target device, and acquiring the target pose and the corresponding target motion trajectory of the target device at each current moment further comprises:
    determining whether the visual information of the target device satisfies a relocalization condition; if so, optimizing the predicted pose and the predicted motion trajectory of the target device according to image frames in the visual information to obtain the target pose and the target motion trajectory; if not, continuing to perform the step of acquiring the motion speed parameter of the target device at the previous moment; wherein the relocalization condition is that, after the visual information is detected, an image frame in the visual information is determined to be successfully matched with an image frame in a pre-built global map or local map.
  7. The method according to claim 6, characterized in that the optimizing the predicted pose and the predicted motion trajectory of the target device, and acquiring the target pose and the corresponding target motion trajectory of the target device at each current moment comprises:
    determining an ideal pose corresponding to the target device at each current moment according to the image frames in the visual information;
    optimizing the predicted motion trajectory according to the ideal pose and the predicted pose estimated before the visual information is detected, to determine the target motion trajectory;
    performing pose optimization on the predicted pose of the target device at each current moment according to the target motion trajectory, to obtain the target pose of the target device at each current moment.
  8. The method according to any one of claims 1 to 3, characterized in that, after the step of calculating the predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment and constructing the predicted motion trajectory of the target device, the method further comprises:
    restarting the system if the visual information of the target device is not detected within a preset duration.
  9. A pose prediction apparatus, characterized in that the apparatus comprises:
    a parameter acquisition module, configured to acquire a motion speed parameter of a target device at a previous moment, wherein the motion speed parameter comprises an angular velocity and a linear acceleration of the target device, and the previous moment is a historical moment before the target device loses visual information;
    an increment determination module, configured to estimate a predicted displacement increment of the target device at each current moment by using the motion speed parameter and a preset displacement prediction model, wherein the current moment is the latest moment at which the target device loses the visual information and each moment after the visual information is lost;
    a pose and trajectory prediction module, configured to calculate a predicted pose of the target device corresponding to each current moment according to the predicted displacement increment of the target device at each current moment, and to construct a predicted motion trajectory of the target device;
    an optimization module, configured to optimize the predicted pose and the predicted motion trajectory of the target device, and to acquire a target pose and a corresponding target motion trajectory of the target device at each current moment.
  10. A pose prediction system, characterized in that the system comprises an inertial navigation sensor and a terminal device, wherein the inertial navigation sensor is configured to measure the motion speed parameter of the target device at the previous moment, and the terminal device is configured to perform the steps of the method according to any one of claims 1 to 8.
  11. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, the steps of the method according to any one of claims 1 to 8 are implemented.
  12. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 8 are implemented.
PCT/CN2022/100638 2022-01-26 2022-06-23 Pose prediction method and apparatus WO2023142353A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210093752.1 2022-01-26
CN202210093752.1A CN114593735A (en) 2022-01-26 2022-01-26 Pose prediction method and device

Publications (1)

Publication Number Publication Date
WO2023142353A1 true WO2023142353A1 (en) 2023-08-03

Family

ID=81805920

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100638 WO2023142353A1 (en) 2022-01-26 2022-06-23 Pose prediction method and apparatus

Country Status (2)

Country Link
CN (1) CN114593735A (en)
WO (1) WO2023142353A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114593735A (en) * 2022-01-26 2022-06-07 奥比中光科技集团股份有限公司 Pose prediction method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110850455A (en) * 2019-10-18 2020-02-28 浙江天尚元科技有限公司 Track recording method based on differential GPS and vehicle kinematics model
CN111161337A (en) * 2019-12-18 2020-05-15 南京理工大学 Accompanying robot synchronous positioning and composition method in dynamic environment
US20210142488A1 (en) * 2019-11-12 2021-05-13 Naver Labs Corporation Method and system for tracking trajectory based on visual localization and odometry
CN113031436A (en) * 2021-02-25 2021-06-25 西安建筑科技大学 Mobile robot model prediction trajectory tracking control system and method based on event triggering
CN113483755A (en) * 2021-07-09 2021-10-08 北京易航远智科技有限公司 Multi-sensor combined positioning method and system based on non-global consistent map
CN114593735A (en) * 2022-01-26 2022-06-07 奥比中光科技集团股份有限公司 Pose prediction method and device

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111489393B (en) * 2019-01-28 2023-06-02 速感科技(北京)有限公司 VSLAM method, controller and mobile device
CN110009681B (en) * 2019-03-25 2021-07-30 中国计量大学 IMU (inertial measurement unit) assistance-based monocular vision odometer pose processing method
CN110195592B (en) * 2019-04-30 2021-02-05 华中科技大学 Intelligent shield tunneling pose prediction method and system based on hybrid deep learning
CN111508024A (en) * 2019-06-27 2020-08-07 浙江大学 Method for estimating pose of robot based on deep learning
CN110702107A (en) * 2019-10-22 2020-01-17 北京维盛泰科科技有限公司 Monocular vision inertial combination positioning navigation method
KR102508318B1 (en) * 2020-02-06 2023-03-10 연세대학교 산학협력단 Estimating method for physical quantities calculatable by integration using artificial neural network
US11435185B2 (en) * 2020-02-21 2022-09-06 Microsoft Technology Licensing, Llc Systems and methods for deep learning-based pedestrian dead reckoning for exteroceptive sensor-enabled devices
CN113945206B (en) * 2020-07-16 2024-04-19 北京图森未来科技有限公司 Positioning method and device based on multi-sensor fusion
CN112556719B (en) * 2020-11-27 2022-01-21 广东电网有限责任公司肇庆供电局 Visual inertial odometer implementation method based on CNN-EKF
CN113065289B (en) * 2021-04-27 2023-04-07 深圳市商汤科技有限公司 Pose prediction method and device, electronic equipment and storage medium
CN113252031A (en) * 2021-04-28 2021-08-13 燕山大学 NARX neural network assisted integrated navigation method
CN113252060B (en) * 2021-05-31 2021-09-21 智道网联科技(北京)有限公司 Vehicle track calculation method and device based on neural network model

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110850455A (en) * 2019-10-18 2020-02-28 浙江天尚元科技有限公司 Track recording method based on differential GPS and vehicle kinematics model
US20210142488A1 (en) * 2019-11-12 2021-05-13 Naver Labs Corporation Method and system for tracking trajectory based on visual localization and odometry
CN111161337A (en) * 2019-12-18 2020-05-15 南京理工大学 Accompanying robot synchronous positioning and composition method in dynamic environment
CN113031436A (en) * 2021-02-25 2021-06-25 西安建筑科技大学 Mobile robot model prediction trajectory tracking control system and method based on event triggering
CN113483755A (en) * 2021-07-09 2021-10-08 北京易航远智科技有限公司 Multi-sensor combined positioning method and system based on non-global consistent map
CN114593735A (en) * 2022-01-26 2022-06-07 奥比中光科技集团股份有限公司 Pose prediction method and device

Also Published As

Publication number Publication date
CN114593735A (en) 2022-06-07

Similar Documents

Publication Publication Date Title
US10852139B2 (en) Positioning method, positioning device, and robot
CN107687850B (en) Unmanned aerial vehicle pose estimation method based on vision and inertia measurement unit
CN112179330B (en) Pose determination method and device of mobile equipment
CN110490900B (en) Binocular vision positioning method and system under dynamic environment
CN112734852B (en) Robot mapping method and device and computing equipment
US20210183100A1 (en) Data processing method and apparatus
CN111209978B (en) Three-dimensional visual repositioning method and device, computing equipment and storage medium
CN108564657B (en) Cloud-based map construction method, electronic device and readable storage medium
WO2022193508A1 (en) Method and apparatus for posture optimization, electronic device, computer-readable storage medium, computer program, and program product
WO2020221307A1 (en) Method and device for tracking moving object
CN111932616B (en) Binocular vision inertial odometer method accelerated by utilizing parallel computation
CN111915675B (en) Particle drift-based particle filtering point cloud positioning method, device and system thereof
CN110942474B (en) Robot target tracking method, device and storage medium
CN111623773B (en) Target positioning method and device based on fisheye vision and inertial measurement
CN114136315B (en) Monocular vision-based auxiliary inertial integrated navigation method and system
CN114323033B (en) Positioning method and equipment based on lane lines and feature points and automatic driving vehicle
WO2023142353A1 (en) Pose prediction method and apparatus
WO2020014864A1 (en) Pose determination method and device, and computer readable storage medium
CN112967340A (en) Simultaneous positioning and map construction method and device, electronic equipment and storage medium
CN115164936A (en) Global pose correction method and device for point cloud splicing in high-precision map manufacturing
CN112556699B (en) Navigation positioning method and device, electronic equipment and readable storage medium
CN113570716A (en) Cloud three-dimensional map construction method, system and equipment
CN116958452A (en) Three-dimensional reconstruction method and system
CN115727871A (en) Track quality detection method and device, electronic equipment and storage medium
CN113034538B (en) Pose tracking method and device of visual inertial navigation equipment and visual inertial navigation equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22923162

Country of ref document: EP

Kind code of ref document: A1