CN117635697A - Pose determination method, pose determination device, pose determination equipment, storage medium and program product

Info

Publication number: CN117635697A
Application number: CN202210992094.XA
Authority: CN
Inventors: 张皓原
Original assignee: Beijing Zitiao Network Technology Co Ltd
Current assignee: Beijing Zitiao Network Technology Co Ltd
Priority date: 2022-08-17
Filing date: 2022-08-17
Publication date: 2024-03-01

Abstract

The present disclosure relates to a pose determination method, apparatus, device, storage medium, and program product, the method comprising: acquiring a camera frame in a sliding window; determining pose information to be optimized corresponding to the camera frame; acquiring issuing pose information corresponding to a key frame in the sliding window; determining a prior constraint condition based on the issuing pose information and the pose information to be optimized; and performing sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain target pose information corresponding to the camera frame. The embodiments of the disclosure integrate the issuing pose information corresponding to the key frame into the prior constraint condition, thereby solving the problem of transmission delay.

Description

Pose determination method, pose determination device, pose determination equipment, storage medium and program product

Technical Field

The present disclosure relates to the field of positioning processing technologies, and in particular, to a pose determining method, apparatus, device, storage medium, and program product.

Background

SLAM (Simultaneous Localization and Mapping) refers to the process in which an electronic device in an unknown environment constructs a map of its surroundings by capturing visual images of the environment and, at the same time, localizes itself. With the development of science and technology, SLAM has important applications in the positioning of mobile devices, such as mobile robots, AR (Augmented Reality), VR (Virtual Reality), unmanned aerial vehicles, virtual visual positioning systems, mobile intelligent terminals, and wearable devices.

Localization fusion has been an important topic for SLAM in real-world applications, especially on mobile devices such as cell phones and wearable devices. When a SLAM algorithm is used for positioning fusion, a sliding window (moving window) algorithm is often used for optimization. Mobile devices place high real-time demands on the algorithm, and wearable devices in particular can undergo severe state changes within a short time. As a result, visual inertial odometry (VIO) schemes on mobile devices are designed so that the sliding window used for optimization contains very few key frames.

When the number of key frames included in the sliding window is small, the key frames are used for positioning fusion, which results in long positioning time delay.

Disclosure of Invention

In order to solve the technical problems, an embodiment of the present disclosure provides a pose determining method, apparatus, device, storage medium and program product, where issuing pose information corresponding to a key frame is integrated into a prior constraint condition, so as to solve the problem of transmission delay.

In a first aspect, an embodiment of the present disclosure provides a pose determining method, which is applied to a visual inertial odometer, including:

Acquiring a camera frame in a sliding window;

determining pose information to be optimized corresponding to the camera frame;

acquiring issuing pose information corresponding to the key frames in the sliding window;

determining prior constraint conditions based on the issuing pose information and the pose to be optimized;

and carrying out sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain target pose information corresponding to the camera frame.

In a second aspect, embodiments of the present disclosure provide a pose determination apparatus configured in a positioning system including a visual odometer, comprising:

the camera frame acquisition module is used for acquiring the camera frames in the sliding window;

the pose to be optimized determining module is used for determining pose information to be optimized corresponding to the camera frame;

the issuing pose information acquisition module is used for acquiring issuing pose information corresponding to the key frames in the sliding window;

the constraint condition determining module is used for determining a priori constraint condition based on the issuing pose information and the pose to be optimized;

and the target pose determining module is used for carrying out sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain the target pose information corresponding to the camera frame.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including:

one or more processors;

a storage means for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the pose determination method according to any of the first aspects described above.

In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the pose determination method according to any of the first aspects described above.

In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program or instructions which, when executed by a processor, implement a pose determination method according to any of the first aspects above.

The embodiment of the disclosure provides a pose determining method, a pose determining device, pose determining equipment, a storage medium and a program product, wherein the pose determining method comprises the following steps: acquiring a camera frame in a sliding window; determining pose information to be optimized corresponding to the camera frame; acquiring issuing pose information corresponding to the key frames in the sliding window; determining prior constraint conditions based on the issuing pose information and the pose to be optimized; and carrying out sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain target pose information corresponding to the camera frame. The embodiment of the disclosure integrates the issuing pose information corresponding to the key frame in the prior constraint condition, thereby solving the problem of transmission delay.

Drawings

The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.

Fig. 1 is a flow chart of a pose determining method in an embodiment of the present disclosure;

Fig. 2 is a schematic structural view of a pose determining apparatus in an embodiment of the present disclosure;

Fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.

It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.

It should be noted that references to "a" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

Current mainstream positioning fusion approaches mainly comprise: positioning fusion based on multiple sensors and positioning fusion based on prior maps.

In multi-sensor positioning fusion, fused positioning is typically implemented in combination with GPS data in outdoor scenes, or with the lidar used in autonomous driving. Mobile devices, by contrast, typically use Visual-Inertial Odometry (VIO) for fused positioning, i.e. a camera combined with an inertial measurement unit (Inertial Measurement Unit, IMU). In indoor scenes, it is common to place signal transmitters at known locations, such as Bluetooth, Wi-Fi or special markers (reflective bars or two-dimensional codes), to provide additional positioning information.

The multi-sensor positioning fusion scheme places high requirements on the sensors and scenes and is difficult to apply on a large scale; on mobile devices in particular, its resource consumption and cost are high.

Positioning fusion based on a prior map mainly constructs a prior map in a similar scene in advance, then uses the map to provide a prior positioning result for subsequent use. The positioning result is sent to the system running in real time, so that the overall positioning accuracy is improved.

Positioning fusion based on a prior map requires the prior map, so an offline map must be constructed first. The main problem is that map positioning is time-consuming, including information transmission, pose information calculation and the like. This time-consuming approach is poorly suited to some real-time scenarios on mobile devices.

Key terms that may appear in the embodiments of the present disclosure are briefly described below.

Simultaneous Localization and Mapping (SLAM) is a leading direction of spatial positioning technology in the vision field; it mainly solves the problems of localization and map construction for a robot moving in an unknown environment. In VR or AR, SLAM provides a map from which a superimposed virtual object can be rendered for the current viewing angle, enhancing the realism of the virtual object; in the field of unmanned aerial vehicles, SLAM constructs a local map to assist the vehicle in autonomous obstacle avoidance and the like. SLAM technology covers a wide range. By sensor, it may be classified into 2D/3D SLAM based on lidar, RGBD SLAM based on a depth camera, visual SLAM based on a visual sensor, and VIO (Visual Inertial Odometry) based on a visual sensor and an inertial unit.

The visual inertial odometer VIO fuses the data of a camera and an inertial measurement unit (Inertial Measurement Unit, IMU) to implement the SLAM algorithm; the camera and the IMU complement each other well. In VIO, the real scale of the camera trajectory can be estimated by aligning the pose estimated by the IMU with the pose estimated by the camera. The IMU can predict the pose of the image frame and the position, in the next frame image, of the feature points from the previous moment, which improves the matching speed of the feature tracking algorithm and its robustness to fast rotation. Finally, the gravity vector provided by the accelerometer in the IMU can convert the estimated position into a real three-dimensional spatial coordinate system.

The positioning process of the visual inertial odometer VIO mainly comprises five parts: image and IMU data preprocessing, initialization, local nonlinear optimization, loop detection and global optimization.

Image and IMU data preprocessing: for the image, feature points are extracted and tracked by KLT pyramid optical flow, in preparation for the vision-only initialization that solves the camera pose. For the IMU, the IMU data are pre-integrated to obtain the pose, speed and rotation angle at the current moment; the pre-integration increments between adjacent frames, together with the pre-integration covariance matrix and Jacobian matrix, are also calculated for use in back-end optimization.

Initialization: first, vision-only initialization is performed to calculate the relative pose of the camera; the result is then aligned with the IMU pre-integration to solve for the initialization parameters.

Local nonlinear optimization: visual-inertial nonlinear optimization is performed over the sliding window, that is, the visual constraints and the IMU constraints are placed in one large objective function and optimized together. Local optimization means optimizing the variables in a window consisting of the current camera frame and the n frames before it; it outputs a more accurate pose.

Loop detection: previously detected image key frames are saved; when the device returns to a place it has passed before, the matching relation of feature points is used to judge whether the place has been visited.

Global optimization: when a loop is detected, nonlinear optimization is performed using the camera constraints and the IMU constraints together with the loop detection constraint. Global optimization is performed on the basis of local optimization and outputs a more accurate pose.

It should be noted that the pose determining method provided in the present disclosure is mainly applied to the local nonlinear optimization part of the visual inertial odometer VIO, and optimizes the camera frame in the sliding window. The target pose information obtained in the embodiments of the disclosure corresponds to the current camera frame in the sliding window and refers to the relatively accurate pose information output by local nonlinear optimization.

The pose determining method according to the embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.

Fig. 1 is a flowchart of a pose determining method according to an embodiment of the present disclosure. The present embodiment is applicable to performing local pose optimization in a visual inertial odometer VIO. The method may be performed by a pose determining apparatus, which may be implemented in software and/or hardware and may be configured in an electronic device.

For example: the electronic device may be a mobile terminal, a fixed terminal, or a portable terminal, such as a mobile handset, a site, a unit, a device, a multimedia computer, a multimedia tablet, an internet node, a communicator, a desktop computer, a laptop computer, a notebook computer, a netbook computer, a tablet computer, a Personal Communication System (PCS) device, a personal navigation device, a Personal Digital Assistants (PDA), an audio/video player, a digital camera/camcorder, a positioning device, a television receiver, a radio broadcast receiver, an electronic book device, a game device, or any combination thereof, including the accessories and peripherals of these devices or any combination thereof.

As another example: the electronic device may be a server, where the server may be a physical server or a cloud server, and may be a single server or a server cluster.

As shown in fig. 1, the pose determining method provided by the embodiment of the present disclosure mainly includes steps S101 to S105.

S101, acquiring a camera frame in a sliding window.

In the video field, a video or the like may be considered as a plurality of pictures that are continuously transformed over time, where a frame refers to each picture, and a camera frame refers to a specific picture acquired by a camera of an electronic device.

Specifically, in the SLAM process, as the number of key frames and landmark points increases, the back-end bundle adjustment (BA) model keeps growing, its computation keeps increasing, and its computational efficiency keeps dropping. To avoid this, a sliding window (Sliding Window) is used to limit the camera frames to be optimized to a certain number, so as to control the scale of the BA model.

The sliding window may be a sliding window based on a time domain, a sliding window based on a frequency domain, or a sliding window combining the time domain and the frequency domain. The manner of determining the sliding window is not limited in this embodiment.

The embodiments of the present disclosure take a time-domain sliding window as an example; it may be understood as a sliding window that continuously moves forward with time, whose length does not change over time. For example, if the sliding window holds 20 image frames, then after the camera of the electronic device acquires a new image frame, the new image frame is added to the sliding window and the earliest acquired image frame in the sliding window is discarded. Discarding an image frame does not literally mean deleting it, because directly discarding its variables would lose information. Instead, the discarded image frames are marginalized, that is, their information is retained as a condition in the sliding window optimization, so that no information is lost.
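The fixed-length window behaviour described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class name `SlidingWindow`, its methods, and the `retained` list (a stand-in for the information kept from frames that leave the window) are all hypothetical.

```python
from collections import deque


class SlidingWindow:
    """Fixed-length window of image frames (hypothetical helper)."""

    def __init__(self, length=20):
        self.length = length
        self.frames = deque()
        # Stand-in for the information kept from frames that leave the
        # window; a real system would fold them into the optimization.
        self.retained = []

    def push(self, frame):
        """Add the newest frame; if the window overflows, move the earliest
        frame to the retained list instead of deleting it outright."""
        self.frames.append(frame)
        if len(self.frames) > self.length:
            self.retained.append(self.frames.popleft())

    def latest(self):
        """The camera frame to be optimized (most recently added frame)."""
        return self.frames[-1]

    def oldest(self):
        """The earliest frame still inside the window."""
        return self.frames[0]


# Usage: feed 25 frame ids into a 20-frame window.
window = SlidingWindow(length=20)
for frame_id in range(25):
    window.push(frame_id)
```

After the loop the window holds frames 5 to 24, and frames 0 to 4 have been moved to `retained`.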

Specifically, in the embodiment of the present disclosure, a camera frame in the sliding window may be understood as the latest image frame acquired by the camera in the electronic device and added to the sliding window. Acquiring a camera frame in the sliding window may be understood as acquiring the latest one of the image frames in the sliding window.

S102, pose information to be optimized corresponding to the camera frame is determined.

The pose information comprises position coordinates, height, orientation information and the like of the camera. The pose information to be optimized can be understood as pose information which corresponds to the camera frame and is not optimized through a sliding window.

The position coordinates of the camera may be two-dimensional coordinates of the camera in a world coordinate system, that is, coordinates of the camera in a Z plane, the height of the camera may be a height value of the camera in the world coordinate system, that is, a height of the camera from the Z plane, and the orientation information may be orientation information of the camera in the world coordinate system. Further, the position coordinates of the camera of the present embodiment may be obtained by a positioning technique, and the orientation information of the camera may be obtained by an orientation meter, a gyroscope, or the like.

S103, acquiring issuing pose information corresponding to the key frames in the sliding window.

The key frame in the sliding window can be any image frame in the sliding window. Specifically, a key frame refers to a frame in which a key action is located in a character or object motion change in an animation or video. Whether a frame is a key frame is judged based on the amount of data association between the oldest key frame in the sliding window and the previous frame. It should be noted that the present disclosure imposes no time requirement on the oldest key frame; it may be nearer or farther in time. However, a selected key frame needs to satisfy criteria on the number of frames since the previous key frame, the spatial distance from the nearest key frame, and the tracking quality. That is, the quality of the key frame must be high enough, because key frames are equivalent to the skeleton of SLAM.

In one embodiment of the present disclosure, in sliding window optimization, the first frame in the sliding window is used as a key frame, and the first frame in the sliding window may refer to the image frame with the earliest acquisition time in the sliding window.

The issuing pose information corresponding to the key frame can be understood as pose information obtained by processing the local pose information corresponding to the key frame. The local pose information corresponding to the key frame refers to the pose information determined for the key frame that has not yet been processed.

In one embodiment of the present disclosure, the processing of the local pose information corresponding to the key frame may be performed locally by the electronic device; alternatively, the electronic device may upload the local pose information corresponding to the key frame to the positioning server, and the positioning server processes it to obtain the issuing pose information corresponding to the key frame.

In one embodiment of the present disclosure, the method further comprises: determining a first frame in the sliding window as a key frame; and sending the local pose information corresponding to the key frame to the positioning server so that the positioning server processes the local pose information to obtain the issuing pose information corresponding to the key frame.

First, it should be noted that the positioning server may be a local server or a cloud server; the present embodiment is not particularly limited in this respect. In addition, embodiments of the present disclosure require a new thread to implement the positioning service. The positioning server is denoted M_LOC; M_LOC may be a cloud module that mainly implements the positioning function and can run only one task at a time.

Further, before each sliding window optimization, a status query request is sent to the positioning server M_LOC, where the status query request instructs M_LOC to return its own working state. When the working state returned by M_LOC is busy, the local follow-up procedure, namely the original local sliding window optimization procedure, continues to be executed; if the working state returned by M_LOC is idle, the local pose information T_{local_wi} corresponding to the key frame is uploaded to M_LOC, so that M_LOC processes T_{local_wi} to obtain the issuing pose information corresponding to the key frame.

It should be noted that the positioning server M_LOC needs to be communicatively connected to a plurality of electronic devices and provides positioning services for them. A busy working state can be understood as M_LOC currently performing a positioning task for another electronic device, or performing another task of its own.

In the embodiment of the disclosure, the local pose information corresponding to the key frame is uploaded to the positioning server, and the positioning server processes it to obtain the issuing pose information corresponding to the key frame, so that the local computation of the electronic equipment can be reduced.
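The status query and conditional upload described above can be sketched as follows. The class `PositioningServerStub` is a hypothetical in-process stand-in for M_LOC, and the method names `query_state` and `upload` are assumptions for illustration; the source does not specify the request format or transport.

```python
class PositioningServerStub:
    """In-process stand-in for the positioning server M_LOC (hypothetical)."""

    def __init__(self):
        self.busy = False
        self.uploads = {}  # time node i -> uploaded local pose information

    def query_state(self):
        """Answer a status query request with the current working state."""
        return "busy" if self.busy else "idle"

    def upload(self, time_node, local_pose):
        """Accept the key frame's local pose; only one task runs at a time."""
        if self.busy:
            raise RuntimeError("M_LOC can run only one task at a time")
        self.uploads[time_node] = local_pose


def maybe_upload_key_frame(server, time_node, local_pose):
    """Run before each sliding window optimization.

    Returns True if the local pose was uploaded, or False if the server was
    busy, in which case the caller simply continues with the original
    sliding window optimization.
    """
    if server.query_state() == "busy":
        return False
    server.upload(time_node, local_pose)
    return True


# Usage: one idle round followed by one busy round.
server = PositioningServerStub()
uploaded_when_idle = maybe_upload_key_frame(server, 7, "T_local_w7")
server.busy = True
uploaded_when_busy = maybe_upload_key_frame(server, 8, "T_local_w8")
```

Keeping the check non-blocking means a busy server never stalls the local optimization loop.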

In one embodiment of the present disclosure, the processing, by the positioning server, of the local pose information to obtain the issuing pose information corresponding to the key frame includes: if the local pose information is image data, the positioning server processes the image data based on the map data stored at its own end and a PnP algorithm to obtain the issuing pose information corresponding to the key frame.

Specifically, PnP (Perspective-n-Point) refers to the problem of estimating motion from 3D-to-2D point pairs, that is, calculating the pose information of a camera given the coordinates of an object in the world coordinate system and the pixel coordinates of the object in the camera image plane; in the specific calculation, n is greater than 4.

According to the embodiment of the disclosure, when a cloud map is stored in the positioning server, the positioning server receives the image data sent by the electronic device and processes the image data based on the PnP algorithm and the cloud map to obtain the issuing pose information corresponding to the key frame.
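PnP solvers generally search for the camera pose that minimizes the reprojection error between known 3D map points and their observed pixels. The sketch below computes only that error for a candidate pose; it is not a PnP solver. It assumes a pinhole camera model, and the intrinsics `fx`, `fy`, `cx`, `cy`, the function names, and the sample points are all hypothetical.

```python
import math


def project(point_w, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    rotation is a 3x3 world-to-camera matrix (list of rows) and
    translation is a 3-vector, so camera point = R * world point + t.
    """
    xc = [
        sum(rotation[r][c] * point_w[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def mean_reprojection_error(points_w, pixels, rotation, translation,
                            fx, fy, cx, cy):
    """Average pixel distance between observed and reprojected points."""
    errors = []
    for point_w, (u, v) in zip(points_w, pixels):
        pu, pv = project(point_w, rotation, translation, fx, fy, cx, cy)
        errors.append(math.hypot(pu - u, pv - v))
    return sum(errors) / len(errors)


# Usage: five map points observed from a camera at the world origin.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (500.0, 500.0, 320.0, 240.0)
map_points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 4.0),
              (-1.0, -1.0, 4.0), (1.0, 1.0, 5.0)]
true_t = [0.0, 0.0, 0.0]
pixels = [project(p, identity, true_t, *intrinsics) for p in map_points]

error_true = mean_reprojection_error(map_points, pixels, identity, true_t,
                                     *intrinsics)
error_shifted = mean_reprojection_error(map_points, pixels, identity,
                                        [0.1, 0.0, 0.0], *intrinsics)
```

The true pose reproduces the observed pixels exactly, while the shifted candidate pose yields a positive error, which is the signal a solver would drive toward zero.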

In one embodiment of the present disclosure, the processing, by the positioning server, the local pose information to obtain the issued pose information corresponding to the keyframe includes: if the local pose information is a pose matrix, the positioning server acquires a timestamp corresponding to the pose matrix; and the positioning server acquires positioning information corresponding to the global positioning system based on the time stamp corresponding to the pose matrix and takes the positioning information as issuing pose information corresponding to the key frame.

The global positioning system may be one or more of the GPS positioning system, the BeiDou navigation and positioning system, and the like.

In the embodiment of the disclosure, the positioning server reads the time stamp corresponding to the pose matrix, acquires the positioning information collected by the global positioning system at that time stamp, and takes the positioning information as the issuing pose information corresponding to the key frame.
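One way to look up positioning information by time stamp is sketched below. The source only states that the lookup is based on the time stamp; choosing the nearest recorded fix and rejecting fixes beyond a tolerance `max_gap` are assumptions made for this example, as are the function name and the data layout.

```python
from bisect import bisect_left


def nearest_fix(gps_fixes, timestamp, max_gap=0.05):
    """Return the position whose time stamp is closest to `timestamp`.

    gps_fixes is a list of (timestamp_seconds, position) tuples sorted by
    time stamp. Returns None when there are no fixes or when the closest
    fix is more than max_gap seconds away (assumed tolerance).
    """
    if not gps_fixes:
        return None
    times = [t for t, _ in gps_fixes]
    idx = bisect_left(times, timestamp)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = gps_fixes[max(idx - 1, 0): idx + 1]
    best_time, best_position = min(
        candidates, key=lambda fix: abs(fix[0] - timestamp)
    )
    if abs(best_time - timestamp) > max_gap:
        return None
    return best_position


# Usage: three fixes recorded 0.1 s apart (illustrative values).
fixes = [
    (10.0, (39.90, 116.40)),
    (10.1, (39.91, 116.41)),
    (10.2, (39.92, 116.42)),
]
matched = nearest_fix(fixes, 10.11)
unmatched = nearest_fix(fixes, 11.0)
```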

In one embodiment of the present disclosure, obtaining issuing pose information corresponding to a key frame in the sliding window includes: sending a pose query request to a positioning server, wherein the pose query request is used for indicating the positioning server to query whether the issuing pose information exists at the local end; and if the issuing pose information exists in the positioning server, acquiring the issuing pose information corresponding to the key frame in the sliding window from the positioning server.

Specifically, the positioning server may be a cloud server, and has a main function of realizing visual positioning, and can only run one task at a time, and the positioning server cannot complete the positioning request service of other devices when in a working running state. The issuing pose information fuses the local pose information corresponding to the key frame.

Before each sliding window optimization, the electronic equipment sends a pose query request to the positioning server, wherein the pose query request is used for indicating the positioning server to query whether the issued pose information exists at the local end; and if the issuing pose information exists in the positioning server, the electronic equipment acquires the issuing pose information corresponding to the key frame in the sliding window from the positioning server. And if the issuing pose information does not exist in the positioning server, the positioning server returns a message that the issuing pose information does not exist to the electronic equipment.

Specifically, before each sliding window optimization, the electronic device queries whether the issuing pose information corresponding to the key frame exists in the positioning server M_LOC. If M_LOC has no issuing pose information corresponding to the key frame, the electronic equipment continues to execute the original sliding window optimization flow; if M_LOC has the issuing pose information corresponding to the key frame, the electronic equipment acquires the issuing pose information from the positioning server and denotes it T_{mloc_wj}.

S104, determining prior constraint conditions based on the issuing pose information and the pose to be optimized.

When acquiring the issuing pose information, it is necessary to ensure that the issuing pose information and the local pose information refer to the same time node. If the issuing pose information exists in the positioning server M_LOC, the electronic equipment acquires it from the positioning server; the issuing pose information is derived from the local pose information that the electronic equipment sent to M_LOC. For example, for time node i, the local pose information is denoted T_{local_wi}, and the issuing pose information stored after processing by M_LOC is denoted T_{mloc_wi}; T_{mloc_wi} contains the key frame information and position information for time node i. Calculating the prior constraint requires the issuing pose information, the uploaded local pose information and the information of the latest key frame in the sliding window; the calculated prior constraint is added to the sliding window optimization, thereby achieving positioning fusion.

In one embodiment of the present disclosure, determining a prior constraint condition based on the issuing pose information and the pose to be optimized includes: determining the prior constraint condition based on the issuing pose information, the uploaded local pose information and the pose information to be optimized.

In particular, a prior constraint may be understood as a prior probability of the corresponding information, based on existing knowledge of the camera frame. For example, an image can generally be represented by a matrix in which each position takes a value from 0 to 255, but a randomly generated matrix is not a natural image; a natural image must satisfy many constraints, such as local smoothness. These constraints are summarized from people's experience with natural images, that is, they are prior conditions that can serve as prior constraints on actual images. In the embodiment of the disclosure, the issuing pose information, the local pose information and the pose information to be optimized are fused to jointly formulate the prior constraint condition.
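The time-node bookkeeping implied above can be sketched as follows: the device keeps each uploaded local pose keyed by its time node, so that an issuing pose returned later is paired with the local pose from the same node. The class `PoseExchange` and the reply format `(time_node, issuing_pose)` are assumptions for illustration, not part of the source.

```python
class PoseExchange:
    """Device-side record of uploaded local poses (hypothetical helper)."""

    def __init__(self):
        self.uploaded = {}  # time node i -> local pose uploaded at node i

    def record_upload(self, time_node, local_pose):
        """Remember the local pose that was sent to the server at node i."""
        self.uploaded[time_node] = local_pose

    def match(self, server_reply):
        """Pair an issuing pose with the local pose of the same time node.

        server_reply is None when the server has no issuing pose, otherwise
        a (time_node, issuing_pose) tuple. Returns (local_pose, issuing_pose),
        or None when the caller should keep the original optimization flow.
        """
        if server_reply is None:
            return None
        time_node, issuing_pose = server_reply
        local_pose = self.uploaded.pop(time_node, None)
        if local_pose is None:
            return None  # no local pose recorded for this time node
        return local_pose, issuing_pose


# Usage: upload at node 3, then receive the server's answer for node 3.
exchange = PoseExchange()
exchange.record_upload(3, "T_local_w3")
no_reply = exchange.match(None)
paired = exchange.match((3, "T_mloc_w3"))
stale = exchange.match((3, "T_mloc_w3"))
```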

In one embodiment of the disclosure, the issuing pose information includes an issuing pose matrix, the local pose information includes a local pose matrix, and the pose information to be optimized includes a pose matrix to be optimized. Determining the prior constraint condition based on the issuing pose information, the local pose information, and the pose to be optimized includes: determining the product of the issuing pose matrix, the local pose matrix, and the current pose matrix as the prior constraint condition.

In an embodiment of the present disclosure, a prior constraint for the issuing pose is computed and applied to the pose information to be optimized T_{local_wj}. In sliding window optimization, this prior constraint is added to the pose optimization of the sliding window as the prior for T_{mloc_wj}, thereby achieving the fusion effect. Here, the issuing pose is denoted T_{mloc_wi}; the suffix i indicates that it was obtained from the uploaded T_{local_wi}.

Here, the prior constraint condition of the present disclosure is computed as the product of the issuing pose matrix T_{mloc_wi}, the inverse T^{-1}_{query_wtmp} of the local pose matrix, and the pose matrix to be optimized T_{local_wj}, that is, T_{mloc_wi} · T^{-1}_{query_wtmp} · T_{local_wj}.
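The matrix product described above can be sketched in a few lines. The example assumes that each pose is a 4x4 homogeneous rigid transform [R | p; 0 1] stored as nested lists, which the disclosure does not state explicitly. The helper names `matmul`, `invert_pose`, `prior_pose`, and `translation` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # assumed layout: 4x4 homogeneous rigid transform


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_pose(t: Matrix) -> Matrix:
    """Invert a rigid transform [R | p; 0 1] as [R^T | -R^T p; 0 1]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def prior_pose(t_mloc_wi: Matrix, t_query_wtmp: Matrix,
               t_local_wj: Matrix) -> Matrix:
    """Return T_mloc_wi * inverse(T_query_wtmp) * T_local_wj."""
    return matmul(matmul(t_mloc_wi, invert_pose(t_query_wtmp)), t_local_wj)


def translation(x: float, y: float, z: float) -> Matrix:
    """Build a pure-translation pose, used here only for illustration."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


# The server moved the key-frame pose by 0.5 along x relative to the stored
# local pose, so the prior for the current frame carries the same offset.
prior = prior_pose(translation(1.5, 0.0, 0.0),
                   translation(1.0, 0.0, 0.0),
                   translation(3.0, 0.0, 0.0))
```

Because only one 4x4 inverse and two 4x4 products are involved, the cost is constant per key frame, which is consistent with the statement that the added computation is small.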

Because the electronic device saves the local pose matrix, incorporating it into the prior constraint condition introduces no transmission delay when the prior constraint condition is computed. The added computation is small compared with the size of the Jacobian matrix required by the overall sliding window optimization, so the overall optimization speed is not affected, while the positioning effect can be noticeably improved.

S105, sliding window optimization is conducted on the pose information to be optimized based on the prior constraint condition, and target pose information corresponding to the camera frame is obtained.
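How a prior constraint enters a sliding window optimization can be illustrated with a deliberately reduced toy problem. This is not the method of the disclosure: each pose is collapsed to a single scalar position, the cost is a weighted linear least-squares sum, and the function names and all weights are hypothetical. The sketch only shows that adding a prior term on one frame pulls the whole window toward the issued value.

```python
from typing import List


def solve_linear(h: List[List[float]], b: List[float]) -> List[float]:
    """Solve h x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(h)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (a[i][n] - s) / a[i][i]
    return x


def optimize_window(first_estimate: float, odometry: List[float],
                    prior_index: int, prior_value: float,
                    w_first: float = 1.0, w_odo: float = 1.0,
                    w_prior: float = 1.0) -> List[float]:
    """Minimize w_first*(x0 - first_estimate)^2
    + sum_k w_odo*(x[k+1] - x[k] - odometry[k])^2
    + w_prior*(x[prior_index] - prior_value)^2 over 1-D positions x."""
    n = len(odometry) + 1
    h = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    # Term that ties the oldest frame to its previous estimate.
    h[0][0] += w_first
    b[0] += w_first * first_estimate
    # Relative-motion terms between consecutive frames.
    for k, d in enumerate(odometry):
        h[k][k] += w_odo
        h[k + 1][k + 1] += w_odo
        h[k][k + 1] -= w_odo
        h[k + 1][k] -= w_odo
        b[k] -= w_odo * d
        b[k + 1] += w_odo * d
    # Prior term derived from the issuing pose.
    h[prior_index][prior_index] += w_prior
    b[prior_index] += w_prior * prior_value
    return solve_linear(h, b)


# Four frames, unit steps from odometry, and a prior of 3.6 on the last frame.
window = optimize_window(0.0, [1.0, 1.0, 1.0], prior_index=3, prior_value=3.6)
```

With equal weights the 0.6 disagreement between odometry and the prior is spread evenly over all terms; raising `w_prior` makes the last frame follow the issued value more closely. A real visual-inertial system would instead optimize 6-DoF poses with a nonlinear solver.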

According to the pose determining method provided by the embodiment of the disclosure, the local pose matrix sent by the electronic device to the positioning server M_LOC is stored, which solves the problem that the timeliness of information transmission is affected when the sliding window covers too small a range. In the embodiment of the disclosure, the prior constraint in sliding window optimization contains the local pose information corresponding to the last key frame, which solves the problem of transmission delay. In addition, the prior constraint of the sliding window optimization incorporates the inverse of the local pose matrix corresponding to the key frame; compared with the scale of the Jacobian matrix of the sliding window optimization, the computation added by the embodiment of the disclosure does not affect the overall sliding window optimization speed.

Fig. 2 is a schematic structural diagram of a pose determining device according to an embodiment of the present disclosure. This embodiment is applicable to pose information processing; the pose determining device may be implemented in software and/or hardware and may be configured in an electronic device. The electronic device includes an intelligent terminal with a pose information processing function, such as a smartphone, a notebook computer, a tablet computer, a digital camera or video camera, or a game device. Optionally, the intelligent terminal includes a touch screen.

As shown in fig. 2, the pose determining apparatus 20 provided in the embodiment of the present disclosure mainly includes: a camera frame acquisition module 21, a pose to be optimized determining module 22, an issuing pose acquisition module 23, a constraint condition determining module 24, and a target pose determining module 25.

Wherein, the camera frame acquisition module 21 is used for acquiring the camera frame in the sliding window;

the pose to be optimized determining module 22 is configured to determine pose information to be optimized corresponding to the camera frame;

the issuing pose acquisition module 23 is configured to acquire issuing pose information corresponding to a key frame in the sliding window;

a constraint condition determining module 24, configured to determine a priori constraint condition based on the issuing pose information and the pose to be optimized;

and the target pose determining module 25 is configured to perform sliding window optimization on the pose information to be optimized based on the prior constraint condition, so as to obtain target pose information corresponding to the camera frame.

The embodiment of the disclosure provides a pose determining device, which is used for executing the following procedures: acquiring a camera frame in a sliding window; determining pose information to be optimized corresponding to the camera frame; acquiring issuing pose information corresponding to the key frames in the sliding window; determining prior constraint conditions based on the issuing pose information and the pose to be optimized; and carrying out sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain target pose information corresponding to the camera frame. The embodiment of the disclosure integrates the issuing pose information corresponding to the key frame into the prior constraint condition, thereby solving the problem of transmission delay.

In one possible embodiment, the apparatus further comprises: a key frame determining module, configured to determine a first frame in the sliding window as a key frame; and a local pose sending module, configured to send the local pose information corresponding to the key frame to the positioning server, so that the positioning server processes the local pose information to obtain the issuing pose information corresponding to the key frame.

In one possible implementation, the issuing pose acquisition module 23 is specifically configured to: send a pose query request to the positioning server, where the pose query request instructs the positioning server to query whether the issuing pose information exists at its local end; and, if the issuing pose information exists in the positioning server, acquire the issuing pose information corresponding to the key frame in the sliding window from the positioning server.
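The query-then-acquire behaviour of the issuing pose acquisition module can be sketched as follows. The class `InMemoryPositioningServer` is a hypothetical in-process stand-in for the positioning server, and the method names `publish`, `has_pose`, and `get_pose` are assumptions; the disclosure does not define a concrete interface or transport.

```python
from typing import Dict, List, Optional

Matrix = List[List[float]]  # assumed layout: 4x4 homogeneous pose matrix


class InMemoryPositioningServer:
    """Hypothetical stand-in for the positioning server M_LOC."""

    def __init__(self) -> None:
        self._issued: Dict[int, Matrix] = {}

    def publish(self, time_node: int, pose: Matrix) -> None:
        # Called once the server has processed the uploaded local pose.
        self._issued[time_node] = pose

    def has_pose(self, time_node: int) -> bool:
        # Answers the pose query request.
        return time_node in self._issued

    def get_pose(self, time_node: int) -> Matrix:
        return self._issued[time_node]


def acquire_issuing_pose(server: InMemoryPositioningServer,
                         key_frame_time_node: int) -> Optional[Matrix]:
    """Query first; fetch the issuing pose only if the server holds it."""
    if not server.has_pose(key_frame_time_node):
        return None
    return server.get_pose(key_frame_time_node)
```

Returning `None` when no issuing pose is available is a choice of this sketch; how the device proceeds in that case is not specified in this passage.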

In a possible implementation manner, the positioning server processing the local pose information to obtain the issuing pose information corresponding to the key frame includes: if the local pose information is image data, the positioning server processes the image data based on map data stored at its local end and a PnP (Perspective-n-Point) algorithm to obtain the issuing pose information corresponding to the key frame.

In a possible implementation manner, the positioning server processing the local pose information to obtain the issuing pose information corresponding to the key frame includes: if the local pose information is a pose matrix, the positioning server acquires the timestamp corresponding to the pose matrix; the positioning server then acquires the corresponding positioning information from the global positioning system based on that timestamp and takes the positioning information as the issuing pose information corresponding to the key frame.
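The timestamp-based lookup can be sketched as a nearest-neighbour search over time-sorted positioning records. The disclosure does not say how the match is made, so the nearest-record rule, the `max_gap` tolerance, and the `Fix` tuple layout are all assumptions of this example.

```python
import bisect
from typing import List, Optional, Tuple

# Assumed record layout: (latitude, longitude, altitude).
Fix = Tuple[float, float, float]


def nearest_fix(timestamps: List[float], fixes: List[Fix], query: float,
                max_gap: float = 0.05) -> Optional[Fix]:
    """Return the fix whose timestamp is closest to `query`.

    `timestamps` must be sorted in ascending order and aligned with `fixes`.
    Returns None when there are no records or when the closest record is
    more than `max_gap` seconds away (a hypothetical tolerance).
    """
    if not timestamps:
        return None
    pos = bisect.bisect_left(timestamps, query)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(timestamps)]
    best = min(candidates, key=lambda i: abs(timestamps[i] - query))
    if abs(timestamps[best] - query) > max_gap:
        return None
    return fixes[best]
```

The binary search keeps the lookup logarithmic in the number of stored records, which matters if the server retains a long positioning history.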

In one possible implementation, the constraint condition determining module 24 is specifically configured to determine a priori constraint condition based on the issuing pose information, the local pose information, and the pose to be optimized.

In one possible implementation, the issuing pose information includes an issuing pose matrix, the local pose information includes a local pose matrix, and the pose information to be optimized includes a pose matrix to be optimized; the constraint condition determining module 24 is specifically configured to determine a product of the issuing pose matrix, the local pose matrix, and the current pose matrix as a priori constraint condition.

The pose determining device provided by the embodiment of the present disclosure may perform the steps of the pose determining method provided by the embodiment of the present disclosure; the steps performed and the beneficial effects are not repeated here.

Fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the disclosure. Referring now in particular to fig. 3, a schematic diagram of an electronic device 300 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device 300 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), wearable terminal devices, and the like, and fixed terminals such as digital TVs, desktop computers, smart home devices, and the like. The electronic device shown in fig. 3 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.

As shown in fig. 3, the electronic device 300 may include a processing means (e.g., a central processor, a graphics processor, etc.) 301 that may perform various suitable actions and processes, according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303, to implement the pose determination method of the embodiments described in the present disclosure. The RAM 303 also stores various programs and data necessary for the operation of the electronic device 300. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.

In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 308 including, for example, magnetic tape, hard disk, etc.; and communication means 309. The communication means 309 may allow the electronic device 300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 3 shows an electronic device 300 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts, thereby implementing the pose determination method as described above. In such an embodiment, the computer program may be downloaded and installed from a network via a communication device 309, or installed from a storage device 308, or installed from a ROM 302. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 301.

It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.

In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.

The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.

The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a camera frame in a sliding window; determine pose information to be optimized corresponding to the camera frame; acquire issuing pose information corresponding to the key frames in the sliding window; determine a prior constraint condition based on the issuing pose information and the pose to be optimized; and perform sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain target pose information corresponding to the camera frame.

Optionally, when the one or more programs are executed by the electronic device, the electronic device may also perform other steps described in the above embodiments.

Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. In some cases, the names of the units do not constitute a limitation on the units themselves.

The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

According to one or more embodiments of the present disclosure, the present disclosure provides a pose determination method, including: acquiring a camera frame in a sliding window; determining pose information to be optimized corresponding to the camera frame; acquiring issuing pose information corresponding to the key frames in the sliding window; determining prior constraint conditions based on the issuing pose information and the pose to be optimized; and carrying out sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain target pose information corresponding to the camera frame.

According to one or more embodiments of the present disclosure, there is provided a pose determination apparatus, the apparatus including: the camera frame acquisition module is used for acquiring the camera frames in the sliding window; the pose to be optimized determining module is used for determining pose information to be optimized corresponding to the camera frame; the issuing pose information acquisition module is used for acquiring issuing pose information corresponding to the key frames in the sliding window; the constraint condition determining module is used for determining a priori constraint condition based on the issuing pose information and the pose to be optimized; and the target pose determining module is used for carrying out sliding window optimization on the pose information to be optimized based on the prior constraint condition to obtain the target pose information corresponding to the camera frame.

According to one or more embodiments of the present disclosure, the present disclosure provides an electronic device comprising:

one or more processors;

a memory for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the pose determination methods as provided by the present disclosure.

According to one or more embodiments of the present disclosure, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the pose determination methods provided by the present disclosure.

The disclosed embodiments also provide a computer program product comprising a computer program or instructions which, when executed by a processor, implements the pose determination method as described above.

The foregoing description is only of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by replacing the features described above with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims

1. A pose determination method applied to a visual inertial odometer, comprising:

acquiring a camera frame in a sliding window;

determining pose information to be optimized corresponding to the camera frame;

2. The method according to claim 1, wherein the method further comprises:

determining a first frame in the sliding window as a key frame;

and sending the local pose information corresponding to the key frame to the positioning server so that the positioning server processes the local pose information to obtain the issuing pose information corresponding to the key frame.

3. The method of claim 2, wherein obtaining the issuing pose information corresponding to the keyframes in the sliding window comprises:

4. The method of claim 2, wherein the processing the local pose information by the positioning server to obtain the issued pose information corresponding to the key frame includes:

and if the local pose information is image data, the positioning server processes the image data based on map data stored by a local end and a PnP algorithm to obtain issuing pose information corresponding to the key frame.

5. The method of claim 2, wherein the processing the local pose information by the positioning server to obtain the issued pose information corresponding to the key frame includes:

if the local pose information is a pose matrix, the positioning server acquires a timestamp corresponding to the pose matrix;

and the positioning server acquires positioning information corresponding to the global positioning system based on the time stamp corresponding to the pose matrix and takes the positioning information as issuing pose information corresponding to the key frame.

6. The method of claim 2, wherein determining a priori constraint based on the issued pose information and the pose to be optimized comprises:

and determining prior constraint conditions based on the issuing pose information, the local pose information and the pose to be optimized.

7. The method of claim 6, wherein the issuing pose information comprises an issuing pose matrix, the local pose information comprises a local pose matrix, and the pose information to be optimized comprises a pose matrix to be optimized;

determining a priori constraint condition based on the issuing pose information, the local pose information and the pose to be optimized comprises the following steps:

and determining the product of the issuing pose matrix, the local pose matrix and the current pose matrix as a priori constraint condition.

8. A pose determination apparatus, the apparatus being configured in a positioning system including a visual odometer, comprising:

9. An electronic device, the electronic device comprising:

one or more processors;

a storage means for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-7.

11. A computer program product comprising a computer program or instructions which, when executed by a processor, implements the method of any of claims 1-7.