CN115937383B - Method, device, electronic equipment and storage medium for rendering image - Google Patents


Info

Publication number
CN115937383B
CN115937383B (application CN202211153154.5A)
Authority
CN
China
Prior art keywords
key frame
sliding window
point cloud
cloud data
frame
Prior art date
Legal status
Active
Application number
CN202211153154.5A
Other languages
Chinese (zh)
Other versions
CN115937383A (en)
Inventor
温佳伟
郭亨凯
Current Assignee
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202211153154.5A
Publication of CN115937383A
Application granted
Publication of CN115937383B

Landscapes

  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present disclosure provide a method, an apparatus, an electronic device, and a storage medium for rendering an image. The method includes: when it is detected that the current frame has been updated into the sliding window as a key frame to be used and a key frame to be applied has overflowed from the sliding window, determining the point cloud data to be processed associated with the key frame to be applied; determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window; updating the constraint set based on the sliding window, the target point cloud data, and the key frame to be applied; and, when the next frame is received, determining the optimized pose of the next frame according to the constraint set and the sliding window, so as to perform image rendering based on the optimized pose. By expanding the positioning constraints of the SLAM system, this technical scheme improves the positioning accuracy of the system, optimizes the image rendering effect, and at the same time guarantees the real-time performance of the system in image processing.

Description

Method, device, electronic equipment and storage medium for rendering image
Technical Field
Embodiments of the present disclosure relate to image processing technologies, and in particular, to a method, an apparatus, an electronic device, and a storage medium for rendering an image.
Background
With the development of computer vision technology, Simultaneous Localization and Mapping (SLAM) algorithms have been widely applied in fields such as augmented reality, virtual reality, autonomous driving, and the positioning and navigation of robots and unmanned aerial vehicles.
In the prior art, various types of systems may be constructed according to the SLAM algorithm to perform corresponding rendering tasks, such as filtering-based SLAM systems and feature-point-based SLAM systems. In practice, however, a SLAM-based system analyzes the received video images to track the camera pose, and the optimized pose of each frame must be determined from the pose of the previous frame. This has the following drawback: as the initial pose information changes from frame to frame, an accumulated error may build up between the initial positioning accuracy and the final positioning accuracy, so that the image rendered by the system is of poor quality and the user experience suffers.
Disclosure of Invention
The present disclosure provides a method, an apparatus, an electronic device, and a storage medium for rendering an image, so as to expand the positioning constraints of a SLAM system, improve the positioning accuracy of the system, and optimize the image rendering effect while guaranteeing the real-time performance of the system in image processing.
In a first aspect, embodiments of the present disclosure provide a method of rendering an image, the method comprising:
when it is detected that the current frame has been updated into a sliding window as a key frame to be used and a key frame to be applied has overflowed from the sliding window, determining point cloud data to be processed associated with the key frame to be applied;
determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window;
updating a constraint set based on the sliding window, the target point cloud data and the key frame to be applied;
and when the next frame is received, determining the optimal pose of the next frame according to the constraint set and the sliding window so as to perform image rendering based on the optimal pose.
In a second aspect, embodiments of the present disclosure also provide an apparatus for rendering an image, the apparatus comprising:
a to-be-processed point cloud data determining module, configured to determine the point cloud data to be processed associated with a key frame to be applied when it is detected that the current frame has been updated into a sliding window as a key frame to be used and the key frame to be applied has overflowed from the sliding window;
a target point cloud data determining module, configured to determine target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window;
a constraint set updating module, configured to update a constraint set based on the sliding window, the target point cloud data, and the key frame to be applied;
and an optimized pose determining module, configured to determine the optimized pose of the next frame according to the constraint set and the sliding window when the next frame is received, so as to perform image rendering based on the optimized pose.
In a third aspect, embodiments of the present disclosure further provide an electronic device, including:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a method of rendering an image as described in any of the embodiments of the present disclosure.
In a fourth aspect, the disclosed embodiments also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a method of rendering an image as described in any of the disclosed embodiments.
According to the technical scheme of the embodiments of the present disclosure, when it is detected that the current frame has been updated into the sliding window as a key frame to be used and a key frame to be applied has overflowed from the sliding window, the point cloud data to be processed associated with the key frame to be applied is determined. Then, target point cloud data is determined from the point cloud data to be processed according to each key frame to be used in the sliding window. Further, the constraint set is updated based on the sliding window, the target point cloud data, and the key frame to be applied. Finally, when the next frame is received, the optimized pose of the next frame is determined according to the constraint set and the sliding window, so that image rendering is performed based on the optimized pose. By expanding the positioning constraints of the SLAM system, the positioning accuracy of the system is improved, the image rendering effect is optimized, and at the same time the real-time performance of the system in image processing is guaranteed.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of a method of rendering an image according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a method of rendering an image according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an apparatus for rendering an image according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a" and "a plurality" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It will be appreciated that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with the relevant laws and regulations, of the type, scope of use, and usage scenarios of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, a prompt is sent to the user to explicitly inform the user that the operation it has requested will require obtaining and using the user's personal information. The user can thus autonomously decide, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, application program, server, or storage medium, that executes the operations of the technical solution of the present disclosure.
As an optional but non-limiting implementation, in response to receiving an active request from the user, the prompt information may be sent to the user, for example, by way of a popup window, in which the prompt information may be presented as text. In addition, the popup window may carry a selection control allowing the user to choose "agree" or "disagree" to providing personal information to the electronic device.
It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
It will be appreciated that the data (including but not limited to the data itself, the acquisition or use of the data) involved in the present technical solution should comply with the corresponding legal regulations and the requirements of the relevant regulations.
Before the present technical solution is introduced, an application scenario may be described by way of example. When a user shoots a video with the camera of a mobile terminal and uploads it to a SLAM-based system, or selects a target video from a database and actively uploads it to a SLAM-based system, the system analyzes the video. An existing SLAM system, however, determines the pose of the current frame from the pose of the previous frame, and the pose of the next frame from the pose of the current frame, thereby realizing pose tracking and positioning. In this process, as the camera pose information changes continuously, a certain accumulated positioning error remains after the whole calculation is finished, causing the positioning of the system to drift. Because of this accumulated error, the process of setting the positioning scale of the system is also affected during tracking and positioning, so that the scale of the whole system becomes inaccurate and more precise positioning and tracking cannot be achieved. According to the scheme of the embodiments of the present disclosure, the key frame to be applied that overflows from the sliding window, together with its corresponding target point cloud data, can be added to the constraint set. After the constraint set is updated, the optimized pose of the next frame is determined based on the constraint set and the sliding window together, and image rendering is performed based on the optimized pose. By expanding the constraints used for pose optimization in the SLAM system, the positioning accuracy in SLAM space is improved, a better rendering result is obtained, and the integrity and consistency of the tracking and positioning of the system are improved.
Fig. 1 is a schematic flow chart of a method for rendering an image according to an embodiment of the present disclosure. The embodiment is applicable to cases in which a video is processed based on a SLAM system and the existing pose constraints are expanded, so that the optimized pose is determined based on the expanded constraints. The method may be performed by an apparatus for rendering an image, and the apparatus may be implemented in the form of software and/or hardware, optionally by an electronic device such as a mobile terminal, a PC, or a server.
As shown in fig. 1, the method includes:
s110, when the current frame is detected to be updated to the sliding window as the key frame to be used and the key frame to be applied overflows from the sliding window, determining the point cloud data to be processed associated with the key frame to be applied.
In this embodiment, the apparatus for executing the method for rendering an image provided by the embodiments of the present disclosure may be integrated into application software supporting special-effect video processing, and the software may be installed in an electronic device such as a mobile terminal or a PC. The application software may be any software for image/video processing; its specific form is not described in detail here, as long as image/video processing can be implemented. It may also be a specially developed application program that implements the addition and display of special effects, or it may be integrated into a corresponding page, through which a user on the PC side can process the special-effect video.
It should be noted that the technical solution of this embodiment may be executed while images are captured in real time by the mobile terminal, or after the system receives video data actively uploaded by the user. The solution of the present disclosure is applicable to a variety of scenarios, such as augmented reality (AR), virtual reality (VR), and autonomous driving.
In this embodiment, when video data is received or acquired, it may be divided into a plurality of video frames, the frames may be ordered according to their time stamps, and any one of them may serve as the current frame. The current frame may or may not be the first frame. If it is the first frame, no earlier video frame has been acquired by the image acquisition device, and the frame may be used directly as a key frame to be used; if it is not the first frame, whether it is a key frame can be determined, and, if so, it is used as a key frame to be used.
In this embodiment, the sliding window may be a pre-constructed sliding window of adaptive size in which a preset number of key frames to be used may be stored. When the number of video frames in the sliding window has reached a preset number threshold and a new key frame to be used enters the sliding window, a frame already stored in the sliding window is moved out so that the new key frame to be used can be stored. The key frame to be used that is moved out of the sliding window serves as the key frame to be applied.
In practical application, when the current frame is determined to be a key frame, it can be added to the sliding window as a key frame to be used; when it is detected that the number of key frames to be used in the sliding window has reached the preset number threshold, one of them overflows from the sliding window as the key frame to be applied.
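The insertion-and-overflow behaviour described above can be sketched as a minimal data structure. This is an illustrative sketch only: the class and method names are ours, and the fixed capacity stands in for the preset number threshold (the patent's window is adaptive in size).

```python
from collections import deque


class SlidingWindow:
    """Minimal sketch of the key-frame sliding window described above.

    When a new key frame to be used is pushed into a full window, the
    oldest key frame overflows and plays the role of the "key frame to
    be applied". Names are illustrative, not from the patent.
    """

    def __init__(self, capacity):
        self.capacity = capacity  # stands in for the preset number threshold
        self.frames = deque()

    def push(self, keyframe):
        """Insert a new key frame to be used; return the overflowed frame, if any."""
        overflowed = None
        if len(self.frames) >= self.capacity:
            overflowed = self.frames.popleft()  # oldest frame overflows
        self.frames.append(keyframe)
        return overflowed
```

For example, with a capacity of 3, pushing a fourth key frame causes the first one to overflow and become the key frame to be applied.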
It should be noted that, before determining whether the current frame is a key frame, the system may triangulate the received current frame to obtain the point cloud data corresponding to it. In this embodiment, the triangulation generally uses the KLT corner detection method, also referred to as KLT optical flow tracking. Specifically, the KLT corner detection method selects a reference key frame suitable for tracking from the key frames to be used and determines its feature points, so that the point cloud data of the current frame is determined based on these feature points. Therefore, when the point cloud data to be processed associated with the key frame to be applied is determined, the point cloud data of all other key frames associated with the key frame to be applied can be retrieved as the point cloud data to be processed.
Optionally, determining the point cloud data to be processed associated with the key frame to be applied includes: retrieving, as the point cloud data to be processed, the point cloud data for which the key frame to be applied served as the reference key frame during triangulation.
In this embodiment, triangulation may mean determining the point cloud data in the current frame based on a corner detection algorithm. When the system receives the current frame, the Point Cloud Data (PCD) in the current frame needs to be determined based on a corner detection algorithm. Point cloud data, commonly used in reverse engineering, is data recorded in the form of points; each point may carry coordinates in three-dimensional space as well as information such as color or illumination intensity. In practical applications it usually also includes contents such as point coordinate precision, spatial resolution, and surface normal vectors, and it is generally stored in the PCD format.
Before triangulation, the feature points corresponding to each key frame to be used in the sliding window may be represented in inverse-depth form, so as to parameterize the key frames to be used.
In practical application, the point cloud data of the current frame is determined based on a corner detection algorithm using one or more key frames to be used in the sliding window as reference frames; among these, the reference frame closest to the current moment is taken as the reference key frame, and the point cloud data in the reference key frame is extracted as the point cloud data to be processed. The advantage of this arrangement is that the integrity and consistency of the positioning and tracking of the system can be ensured.
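The text names KLT tracking but does not spell out the triangulation itself. As a hedged illustration, a classic two-view linear (DLT) triangulation of one matched feature point between a reference key frame and the current frame might look as follows; the function name and the assumption of known 3x4 projection matrices are ours, not from the patent.

```python
import numpy as np


def triangulate_dlt(P1, P2, x1, x2):
    """Two-view linear (DLT) triangulation of one feature point.

    P1, P2: 3x4 camera projection matrices of the reference key frame
            and the current frame (assumed known for this sketch).
    x1, x2: the matched feature point (u, v) in each view.
    Returns the 3D point as a length-3 array.
    """
    # Each observation contributes two linear equations on the
    # homogeneous 3D point X: x * (P[2] . X) = P[0] . X, etc.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of A with the smallest
    # singular value (the approximate null space of A).
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

For instance, with the reference camera at the origin and the current camera translated one unit along x, the point (1, 2, 5) projects to (0.2, 0.4) and (0.0, 0.4) respectively, and the function recovers the 3D point from those two observations.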
S120, determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window.
In this embodiment, after the point cloud data to be processed associated with the key frame to be applied is determined, it may be screened against each key frame to be used, to determine whether a connection exists between a given piece of point cloud data to be processed and any key frame to be used. If such a connection exists, that point cloud data can be associated with the corresponding key frame to be used; if not, it can be extracted as target point cloud data.
Optionally, determining the target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window includes: for each piece of point cloud data to be processed, determining whether a reference key frame exists in the sliding window for the current point cloud data to be processed; and, if not, determining the current point cloud data to be processed as target point cloud data.
In practical application, for each piece of point cloud data to be processed, the current point cloud data to be processed is screened against each key frame to be used in the sliding window, to determine whether any of them can serve as a reference key frame for it. If not, the current point cloud data to be processed can be used as target point cloud data; if so, it can be associated with the key frame to be used in the sliding window in which it can be observed, and serve as point cloud data of that key frame. The advantage of this arrangement is that, by controlling the number of local map points, the efficiency of subsequent calculation can be effectively controlled, which facilitates real-time image processing by the simultaneous localization and mapping system.
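The screening step above can be sketched as follows. The `observations` bookkeeping structure (which key frame can observe which points) and all names are illustrative assumptions, not part of the patent.

```python
def screen_point_clouds(pending_points, window_keyframes, observations):
    """Screen the point cloud data to be processed (illustrative sketch).

    pending_points:   ids of the point cloud data associated with the
                      overflowed key frame to be applied.
    window_keyframes: ids of the key frames to be used in the window.
    observations:     dict mapping a key frame id to the set of point
                      ids it can observe (an assumed structure).
    Returns (target_points, reassociated): points no key frame in the
    window observes become target point cloud data; the rest are
    re-associated with a key frame that observes them.
    """
    target_points = []
    reassociated = {}
    for point in pending_points:
        observers = [kf for kf in window_keyframes
                     if point in observations.get(kf, set())]
        if observers:
            reassociated[point] = observers[0]  # stays as local window data
        else:
            target_points.append(point)         # becomes target point cloud data
    return target_points, reassociated
```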
It should be noted that determining whether a reference key frame exists in the sliding window for the current point cloud data to be processed may be implemented by determining whether each key frame to be used in the sliding window can observe the current point cloud data to be processed.
Optionally, determining whether a reference key frame exists in the sliding window for the current point cloud data to be processed includes: determining, for the key frames to be used in the sliding window in order of increasing time distance from the current moment, whether each key frame to be used can observe the current point cloud data to be processed; and taking the key frame to be used in which the current point cloud data to be processed is first observed as the reference key frame.
In this embodiment, the time stamp of each key frame to be used in the sliding window is determined, each time stamp is compared with the current moment, and the difference between them is taken as the time-distance information of the corresponding key frame to be used. In order of increasing time distance, it is determined whether each key frame to be used can observe the current point cloud data to be processed, and the key frame to be used in which the current point cloud data to be processed is first observed is taken as the reference key frame. Specifically, it is first determined whether the key frame to be used closest to the current moment can observe the current point cloud data to be processed; if so, that key frame is taken as the reference key frame; if not, the same check is made on the second-closest key frame to be used, then on the third-closest, and so on, until all key frames to be used in the sliding window have been traversed once. The advantage of this arrangement is that the reference key frame associated with each piece of point cloud data to be processed can be screened out from the key frames to be used quickly and accurately and associated with it, which reduces subsequent point cloud computation and improves calculation efficiency.
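The nearest-in-time search described above might be sketched as follows; the names, the (timestamp, id) layout, and the `observations` structure are illustrative assumptions.

```python
def find_reference_keyframe(point, window_keyframes, observations, now):
    """Pick a reference key frame for one point cloud datum to be processed.

    window_keyframes: list of (timestamp, keyframe_id) pairs in the
                      sliding window.
    observations:     keyframe_id -> set of observable point ids
                      (assumed bookkeeping, not named in the text).
    now:              the current moment.
    Key frames are tried in order of increasing time distance from
    `now`; the first one that observes the point is returned, or None
    if no key frame in the window observes it.
    """
    ordered = sorted(window_keyframes, key=lambda kf: abs(now - kf[0]))
    for _, kf_id in ordered:
        if point in observations.get(kf_id, set()):
            return kf_id
    return None
```

A `None` result corresponds to the branch above in which the point becomes target point cloud data.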
S130, updating the constraint set based on the sliding window, the target point cloud data and the key frame to be applied.
In this embodiment, after key frames to be applied are determined from the key frames to be used and removed from the sliding window, their feature points have already been fixed: these frames were screened from many earlier video frames, stored in the sliding window, and optimized multiple times. Even after removal from the sliding window, they can therefore still be used to optimize the pose of subsequently received video frames. Accordingly, when a key frame to be applied overflows from the sliding window, it can enter a new set, and the poses of subsequently received video frames can be optimized based on the key frames to be applied in this set. It can be appreciated that, in this embodiment, the set that a key frame to be applied enters after overflowing from the sliding window is the constraint set, which may also be referred to as a local map.
It should be noted that, when the key frame to be applied enters the constraint set, the target point cloud data associated with it may also be added to the constraint set. Moreover, as the key frames to be used in the sliding window are continuously updated, the constraint set may also process its key frames to be applied in order to ensure the consistency and accuracy of tracking, thereby keeping the constraint set updated and optimized.
Optionally, updating the constraint set based on the sliding window, the target point cloud data, and the key frame to be applied includes: updating the target point cloud data and the key frame to be applied into the constraint set; and updating the updated constraint set based on the key frames to be used in the sliding window and the key frames to be applied in the updated constraint set.
In practical application, after the key frame to be applied and the target point cloud data are determined, they can be added to the constraint set, thereby updating it. Further, since the sliding window is in a dynamically updated state, that is, when the system determines that the current frame is a key frame it adds that frame to the sliding window and removes one frame from it, the key frames to be applied in the constraint set can be processed based on the key frames to be used in the sliding window, so that the connection between the sliding window and the constraint set is kept at all times and the updated constraint set is updated again. The advantage of this arrangement is that the relationship between the sliding window and the constraint set is maintained at all times, which improves the accuracy of pose optimization and, in turn, the positioning accuracy of the system.
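The first update step, moving the overflowed key frame and its target point cloud data into the constraint set, can be sketched as a small container; the class and member names are illustrative, not from the patent.

```python
class ConstraintSet:
    """Sketch of the "constraint set" (local map) described above.

    It stores the key frames to be applied that have overflowed the
    sliding window, together with their target point cloud data.
    """

    def __init__(self):
        self.keyframes = []  # key frames to be applied, oldest first
        self.points = {}     # keyframe id -> its target point cloud data

    def add(self, keyframe_id, target_points):
        """Step 1: move an overflowed key frame and its target point
        cloud data into the constraint set."""
        self.keyframes.append(keyframe_id)
        self.points[keyframe_id] = list(target_points)

    def remove(self, keyframe_id):
        """Step 2 (pruning): drop a key frame that no longer shares
        common-view points with the sliding window, per the deletion
        rule described in the text."""
        self.keyframes.remove(keyframe_id)
        self.points.pop(keyframe_id, None)
```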
In this embodiment, as the system continuously adds new key frames to the sliding window, the feature point data of the key frames in the sliding window also changes accordingly. Whether a key frame to be applied should be deleted from the constraint set can be determined according to whether common feature points exist between the key frames to be used in the sliding window and the key frames to be applied in the constraint set, thereby updating the updated constraint set.
Optionally, updating the updated constraint set based on the key frames to be used in the sliding window and the key frames to be applied in the updated constraint set includes: obtaining the target key frame to be used whose time interval from the current moment is longest in the sliding window, and the target key frame to be applied whose time interval from the current moment is longest in the constraint set; determining the common viewpoint data of the target key frame to be used and the target key frame to be applied; and, if the common viewpoint data does not meet a preset condition, deleting the target key frame to be applied from the constraint set so as to update the constraint set.
In this embodiment, the time stamp of each key frame to be used in the sliding window is determined, and the key frame to be used with the longest interval from the current moment is taken as the target key frame to be used; likewise, the time stamp of each key frame to be applied in the constraint set is determined, and the key frame to be applied with the longest interval from the current moment is taken as the target key frame to be applied.
The common-view point data refers to the feature points shared by the target key frame to be used and the target key frame to be applied. For example, if the feature points of the target key frame to be used are a, b, c, f, g, and the feature points of the target key frame to be applied are a, d, k, f, m, then feature points a and f constitute the common-view point data of the two frames.
In practical application, a similarity algorithm can be used to calculate a similarity value between a feature point in the target key frame to be applied and a feature point in the target key frame to be used. If the similarity value between the two feature points reaches a certain threshold, the two points can be considered sufficiently similar, and the point in the target key frame to be applied can be treated as a common-view point with the target key frame to be used.
Further, when no common-view point exists between the target key frame to be applied and the target key frame to be used, that is, when the common-view point data does not meet the preset condition, the target key frame to be applied can be deleted from the constraint set, thereby updating the constraint set. The advantage of this arrangement is that the consistency and accuracy of tracking are ensured, which in turn preserves the real-time performance of the system in image processing.
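The pruning step above can be sketched as follows. This is a minimal sketch under assumptions not fixed by the patent: key frames are represented as dictionaries with a `timestamp` and a list of feature `descriptors`, common-view points are detected by cosine similarity against an assumed threshold, and the names `common_view_points` and `prune_constraint_set` are illustrative.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.9  # assumed threshold for declaring two descriptors the same point

def common_view_points(desc_a, desc_b, threshold=SIMILARITY_THRESHOLD):
    """Return index pairs of feature descriptors whose cosine similarity
    reaches the threshold, i.e. the common-view point data of two frames."""
    matches = []
    for i, a in enumerate(desc_a):
        for j, b in enumerate(desc_b):
            sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
            if sim >= threshold:
                matches.append((i, j))
    return matches

def prune_constraint_set(window, constraint_set):
    """Delete the oldest key frame to be applied from the constraint set when
    it shares no common-view points with the oldest key frame to be used
    in the sliding window."""
    if not window or not constraint_set:
        return constraint_set
    target_used = min(window, key=lambda kf: kf["timestamp"])      # longest interval from now
    target_applied = min(constraint_set, key=lambda kf: kf["timestamp"])
    if not common_view_points(target_used["descriptors"], target_applied["descriptors"]):
        constraint_set = [kf for kf in constraint_set if kf is not target_applied]
    return constraint_set
```

The brute-force descriptor comparison is only for clarity; a real system would use a matcher with geometric verification.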
And S140, when the next frame is received, determining the optimal pose of the next frame according to the constraint set and the sliding window so as to perform image rendering based on the optimal pose.
In this embodiment, the video frame closest in acquisition time to the current frame may be taken as the next frame, and its initial pose determined; this initial pose may then be optimized according to the constraint set and the sliding window to obtain the optimized pose of the next frame.
Optionally, determining the optimized pose of the next frame according to the constraint set and the sliding window includes: taking the key frames to be applied and the target point cloud data in the constraint set, together with the key frames to be used and the point cloud data to be processed in the sliding window, as constraint conditions for determining the pose of the next frame; and determining the optimized pose of the next frame based on these constraint conditions.
In practical application, the next frame can be optimized based on the bundle adjustment method; the key frames to be applied and the target point cloud data in the constraint set, together with the key frames to be used and the point cloud data to be processed in the sliding window, serve as the constraint conditions of the bundle adjustment, so that the optimized pose of the next frame is finally determined.
It should be understood by those skilled in the art that graph optimization over both camera poses and space points is called BA. It can effectively solve large-scale positioning and map construction, but as the scale grows, computational efficiency drops sharply, and the optimization of the feature points accounts for a large part of the cost. After several iterations the feature points converge, at which point further optimization of them yields little benefit. Therefore, in practice, after the feature points have been optimized several times, they can be fixed and treated as constraints on pose estimation; that is, their positions are no longer optimized. On this basis, pose-graph optimization can be understood as optimizing, considering only the poses, a graph that contains only the trajectory: the edge between two pose nodes is given an initial value by the motion estimate obtained from feature matching between the two key frames, and once that initial value is determined, the positions of the landmark points are no longer optimized; only the relations between camera poses are of concern.
In this embodiment, an incremental BA problem construction is used to optimize the initial pose of the current frame, so that the simultaneous localization and mapping system achieves a higher BA speed, thereby ensuring real-time performance when processing each video frame.
The bundle adjustment method (Bundle Adjustment) takes the projections of all points in the images as its criterion and simultaneously solves for the 3D point coordinates describing the scene structure, the relative motion parameters, and the optical parameters of the camera. It can be understood that, for any three-dimensional point P in the scene, the ray emitted from the optical center of each view's camera through the pixel corresponding to P should pass through P; over all three-dimensional points this forms a considerable bundle of rays. In practice, because of noise and other factors, the rays almost never converge exactly at a point, so during solving the unknowns must be adjusted continuously until the rays finally meet at P. The ultimate objective of bundle adjustment is to reduce the reprojection error between the observed positions of points in each key frame to be applied and their positions predicted from the reference or predicted image, thereby obtaining the best estimates of the three-dimensional structure and the motion (e.g., camera matrix) parameters.
Bundle adjustment generally exploits the sparsity of the BA model in its computation, which may involve the steepest descent method, Newton-type methods, the Levenberg-Marquardt (LM) method, and the like; the embodiments of the present disclosure do not particularly limit this.
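As a minimal illustration of the reprojection error that bundle adjustment drives down, the following sketch uses a simple pinhole model; the point coordinates, pose, and focal length are invented for the example and are not taken from the patent.

```python
import numpy as np

def project(point_3d, rotation, translation, focal):
    """Pinhole projection of a 3D point into pixel coordinates
    (principal point at the origin for simplicity)."""
    p_cam = rotation @ point_3d + translation
    return focal * p_cam[:2] / p_cam[2]

def reprojection_error(points_3d, observations, rotation, translation, focal):
    """Sum of squared distances between observed pixels and reprojections;
    this is the quantity bundle adjustment minimizes. Once the 3D points
    have converged, they are held fixed and only the pose terms are adjusted."""
    err = 0.0
    for p, obs in zip(points_3d, observations):
        err += np.sum((project(p, rotation, translation, focal) - obs) ** 2)
    return err

# With the pose that generated the observations, the error is zero;
# any perturbation of the pose makes it positive.
R = np.eye(3)
t = np.array([0.0, 0.0, 0.0])
pts = [np.array([0.1, -0.2, 2.0]), np.array([0.3, 0.4, 4.0])]
obs = [project(p, R, t, focal=500.0) for p in pts]
```

A full BA solver would stack these residuals for all frames and points and minimize them with an LM-type method exploiting the sparsity mentioned above.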
In practical application, when the next frame is optimized based on bundle adjustment, the key frames to be used and the point cloud data to be processed in the sliding window, together with the key frames to be applied and the target point cloud data in the constraint set, are taken as the constraint conditions, from which the optimized pose of the next frame can be determined. The advantage of this arrangement is that it provides more effective constraints for the pose optimization process, improves the accuracy of pose optimization, and thus improves the integrity and consistency of the system's tracking.
In this embodiment, after the optimized pose of the next frame is determined, image rendering may be performed based on it. For example, the map data may be updated based on the optimized pose, or the specific rendering position of a target object in the AR scene may be determined based on it.
It should be noted that, after the next frame is received, it may in turn be taken as the current frame, with the previous key frames serving as historical key frames, so as to determine whether the next frame is itself a key frame based on those historical key frames.
According to the technical scheme of this embodiment, when it is detected that the current frame is updated into the sliding window as a key frame to be used and a key frame to be applied overflows from the sliding window, the point cloud data to be processed associated with the key frame to be applied is determined. Target point cloud data is then determined from the point cloud data to be processed according to the key frames to be used in the sliding window, and the constraint set is updated based on the sliding window, the target point cloud data, and the key frame to be applied. Finally, when the next frame is received, the optimized pose of the next frame is determined according to the constraint set and the sliding window, so that image rendering is performed based on the optimized pose. By expanding the positioning constraint conditions of the SLAM system, the positioning accuracy of the system is improved and the image rendering effect is optimized, while the real-time performance of the system in image processing is guaranteed.
Fig. 2 is a flowchart of a method for rendering an image according to an embodiment of the present disclosure. On the basis of the foregoing embodiment, when the current frame is determined to be a key frame, it is updated into the sliding window; when the number of key frames to be used in the sliding window reaches a preset number threshold, a key frame to be applied is determined so that it overflows from the sliding window. For the specific implementation, reference may be made to the technical scheme of this embodiment. Technical terms identical or corresponding to those of the above embodiments are not repeated herein.
As shown in fig. 2, the method specifically includes the following steps:
and S210, when the current frame is determined to be the key frame, updating the current frame serving as the key frame to be used into the sliding window.
In this embodiment, there are various ways to determine whether the current frame is a key frame; each is described in turn below.
Optionally, determining whether the current frame is a key frame includes: determining the target feature points of the current frame and the displacement parallax between the current frame and at least one key frame to be applied; and, when the number of target feature points reaches a first preset number threshold and the displacement parallax is greater than a first preset displacement parallax threshold, determining that the current frame is a key frame.
Because the camera is in continuous motion, the photographed object moves across the image, forming displacement parallax; the distance to an object can be judged, at least in part, from this parallax across frames. The target feature points are points determined from the objects in each frame of image. For example, if a frame of image contains several steps of a staircase, the system can determine a number of corresponding feature points on each step according to a pre-trained feature point determination algorithm; these are the target feature points. In this embodiment, the system presets a threshold for the number of target feature points, namely the first preset number threshold, and likewise presets a threshold for the displacement parallax, namely the first preset displacement parallax threshold. On this basis, after the system determines the target feature points of the current frame and the displacement parallax between the current frame and at least one key frame to be applied, it compares both values against their thresholds, and when both exceed the corresponding preset thresholds, the current frame is determined to be a key frame.
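The first criterion above can be sketched as follows. This is a minimal sketch: the two threshold values are illustrative placeholders, since the patent does not specify concrete numbers.

```python
FEATURE_COUNT_THRESHOLD = 50   # assumed first preset number threshold
PARALLAX_THRESHOLD = 10.0      # assumed first preset displacement parallax threshold (pixels)

def is_key_frame(num_target_features, displacement_parallax,
                 count_threshold=FEATURE_COUNT_THRESHOLD,
                 parallax_threshold=PARALLAX_THRESHOLD):
    """First key-frame criterion: enough target feature points in the current
    frame and enough displacement parallax against a key frame to be applied."""
    return (num_target_features >= count_threshold
            and displacement_parallax > parallax_threshold)
```

The two alternative criteria described below follow the same shape, with the comparison objects swapped for feature counts and displacement deviations.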
Optionally: determining the common-view feature points of the current frame and at least one key frame to be applied, and determining the target feature points by downsampling the common-view feature points in the current frame; determining the displacement deviation between the current frame and the key frame to be applied; and, if the number of target feature points is smaller than the number of feature points to be processed in the current frame and the displacement deviation is smaller than a second preset displacement deviation, determining that the current frame is a key frame.
After receiving the current frame and determining a number of feature points in its picture, the system can compare them with the feature points in the picture corresponding to at least one key frame to be applied, thereby determining the common-view feature points between the images. In this embodiment, the system also presets a threshold for the displacement deviation parameter, namely the second preset displacement deviation. On this basis, after the system determines the target feature points of the current frame and the displacement deviation between the current frame and the key frame to be applied, it compares the number of target feature points with the number of feature points to be processed in the current frame, and compares the displacement deviation with the second preset displacement deviation; when both parameters are smaller than their corresponding comparison objects, the current frame is determined to be a key frame.
Optionally: downsampling the point cloud data to be processed of the current frame to obtain the target feature points; determining the displacement deviation between the current frame and at least one key frame to be applied; and, if the number of target feature points is smaller than or equal to the number of common-view feature points and the displacement deviation is smaller than a third preset displacement deviation, determining that the current frame is a key frame.
In this embodiment, the system presets a threshold for the displacement deviation parameter, namely the third preset displacement deviation. The number of target feature points is then compared with the number of common-view feature points between the video frames, and the displacement deviation is compared with the third preset displacement deviation; when both parameters are smaller than their corresponding comparison objects, the current frame is determined to be a key frame.
In practical application, after determining that the current frame is a key frame, the current frame can be added into the sliding window as the key frame to be used, so that updating of the sliding window is realized. This process can be understood as an update optimization of the local map information.
S220, determining key frames to be applied from the key frames to be used according to the time stamps of the key frames to be used when the number of the key frames to be used in the sliding window reaches a preset number threshold.
In practical application, after the current frame is determined to be a key frame, it is added to the sliding window as a key frame to be used, thereby updating the sliding window. In the scheme of this embodiment, to ensure the consistency and accuracy of tracking, a sliding-window structure is adopted to maintain the local map information; the sliding window may contain several adjacent frames and the space points observed from them. On this basis, during local optimization of the map information, the current frame needs to be constrained with historical information to improve the accuracy of the local optimization result. In practice, a frame count can be preset for the sliding-window structure, and this count also determines the number of key frames in the sliding window.
If the number of the key frames to be used in the sliding window reaches a preset number threshold, determining a time stamp of each key frame to be used in the sliding window, taking the key frame to be used with the longest interval with the current time as a key frame to be applied, and removing the key frame from the sliding window.
For example, when the sliding window can contain at most 10 adjacent key frames as key frames to be used, the preset number threshold is 10. After the simultaneous localization and mapping system adds the current frame to the sliding window, in order to keep the number of key frames to be used in the window equal to the preset number threshold, the one of the 10 historical key frames to be used that carries the earliest timestamp, i.e., the longest interval from the current moment, needs to be selected and taken as the key frame to be applied.
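The overflow step can be sketched as follows. This is a minimal sketch: the dictionary-based frame representation is illustrative, and the window size of 10 follows the example above.

```python
WINDOW_SIZE = 10  # preset number threshold from the example above

def push_key_frame(window, new_frame, size=WINDOW_SIZE):
    """Add a new key frame to the sliding window. If the window then exceeds
    its preset size, remove and return the key frame with the earliest
    timestamp as the key frame to be applied (the overflow frame)."""
    window.append(new_frame)
    overflow = None
    if len(window) > size:
        overflow = min(window, key=lambda kf: kf["timestamp"])
        window.remove(overflow)
    return overflow
```

Returning the overflow frame rather than discarding it matters here, because the subsequent steps move it, together with its target point cloud data, into the constraint set.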
S230, overflowing the key frame to be applied from the sliding window, and determining the point cloud data to be processed associated with the key frame to be applied.
In practical application, in order to ensure the consistency of the size of the sliding window, after the key frames to be applied are screened out from the sliding window, the key frames to be applied overflow from the sliding window, and the point cloud data to be processed associated with the key frames to be applied are determined based on each key frame to be used in the sliding window.
S240, determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window.
S250, updating the constraint set based on the sliding window, the target point cloud data and the key frame to be applied.
And S260, when the next frame is received, determining the optimal pose of the next frame according to the constraint set and the sliding window so as to perform image rendering based on the optimal pose.
According to the technical scheme of this embodiment, when the current frame is determined to be a key frame, it is updated into the sliding window as a key frame to be used. When the number of key frames to be used in the sliding window reaches the preset number threshold, the key frame to be applied is determined from the key frames to be used according to their timestamps and overflows from the sliding window, and the point cloud data to be processed associated with it is determined. Target point cloud data is then determined from the point cloud data to be processed according to the key frames to be used in the sliding window, and the constraint set is updated based on the sliding window, the target point cloud data, and the key frame to be applied. Finally, when the next frame is received, the optimized pose of the next frame is determined according to the constraint set and the sliding window, so that image rendering is performed based on the optimized pose. By strictly controlling the number of key frames in the sliding window, the number of local map points is also indirectly controlled, which keeps the subsequent computation efficient and facilitates deployment of the simultaneous localization and mapping system on mobile devices; at the same time, the positioning accuracy of the system is improved and the image rendering effect is optimized.
Fig. 3 is a schematic structural diagram of an apparatus for rendering an image according to an embodiment of the disclosure, as shown in fig. 3, where the apparatus includes: the system comprises a point cloud data to be processed determining module 310, a target point cloud data determining module 320, a constraint set updating module 330 and an optimization pose determining module 340.
A to-be-processed point cloud data determining module 310, configured to determine to-be-processed point cloud data associated with a to-be-applied key frame when it is detected that a current frame is updated to a sliding window as the to-be-used key frame and the to-be-applied key frame overflows from the sliding window;
the target point cloud data determining module 320 is configured to determine target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window;
a constraint set updating module 330, configured to update a constraint set based on the sliding window, the target point cloud data, and the key frame to be applied;
and the optimization pose determining module 340 is configured to determine, when a next frame is received, an optimization pose of the next frame according to the constraint set and the sliding window, so as to perform image rendering based on the optimization pose.
Based on the above technical solutions, the point cloud data determining module 310 to be processed includes a key frame determining unit, a key frame determining unit to be applied, and a key frame overflow unit to be applied.
A key frame determining unit, configured to update a current frame as a key frame to be used into the sliding window when determining that the current frame is a key frame;
the key frame to be applied determining unit is used for determining key frames to be applied from the key frames to be used according to the time stamp of each key frame to be used when the number of the key frames to be used in the sliding window reaches a preset number threshold;
and a key frame overflow unit, configured to overflow the key frame to be applied from the sliding window.
Based on the above technical solutions, the to-be-processed point cloud data determining module 310 includes a to-be-processed point cloud data retrieving unit.
And the to-be-processed point cloud data retrieving unit is configured to retrieve the point cloud data to be processed for which the key frame to be applied serves as the reference key frame in triangulation.
Based on the above technical solutions, the target point cloud data determining module 320 includes a reference key frame determining sub-module and a target point cloud data determining sub-module.
The reference key frame determining submodule is used for determining whether the reference key frame exists in the sliding window or not for the current point cloud data to be processed for each point cloud data to be processed;
and the target point cloud data determining submodule is used for determining that the current point cloud data to be processed is the target point cloud data if not.
On the basis of the technical schemes, the reference key frame determining submodule comprises a point cloud data observation unit to be processed and a reference key frame determining unit.
The to-be-processed point cloud data observation unit is configured to determine, in ascending order of each key frame to be used's duration from the current moment in the sliding window, whether the current point cloud data to be processed can be observed in each key frame to be applied;
and the reference key frame determining unit is used for taking the key frame to be applied of which the current point cloud data to be processed is observed for the first time as the reference key frame.
Based on the above technical solutions, the constraint set updating module 330 includes a constraint set first updating sub-module and a constraint set second updating sub-module.
A constraint set first updating sub-module, configured to update the target point cloud data and the key frame to be applied to the constraint set;
and the constraint set second updating sub-module is used for updating the updated constraint set based on the key frame to be used in the sliding window and the key frame to be applied in the updated constraint set.
On the basis of the technical schemes, the constraint set second updating sub-module comprises a target key frame to be used acquisition unit, a common view point data determination unit and a target key frame to be applied deletion unit.
The target to-be-used key frame acquisition unit is used for acquiring a target to-be-used key frame with the longest time interval duration from the current time in the sliding window and a target to-be-applied key frame with the longest time interval duration from the current time in the constraint set;
the common view point data determining unit is used for determining common view point data of the target to-be-used key frame and the target to-be-applied key frame;
and the target to-be-applied key frame deleting unit is used for deleting the target to-be-applied key frame from the constraint set to update the constraint set if the common view point data does not meet the preset condition.
Based on the above technical solutions, the optimization pose determining module 340 includes a constraint condition determining unit and an optimization pose determining unit.
The constraint condition determining unit is used for taking each key frame to be applied, target point cloud data, key frames to be used in the sliding window and point cloud data to be processed in the constraint set as constraint conditions for determining the pose of the next frame;
and the optimization pose determining unit is used for determining the optimization pose of the next frame based on the constraint condition.
According to the technical scheme of this embodiment, when it is detected that the current frame is updated into the sliding window as a key frame to be used and a key frame to be applied overflows from the sliding window, the point cloud data to be processed associated with the key frame to be applied is determined. Target point cloud data is then determined from the point cloud data to be processed according to the key frames to be used in the sliding window, and the constraint set is updated based on the sliding window, the target point cloud data, and the key frame to be applied. Finally, when the next frame is received, the optimized pose of the next frame is determined according to the constraint set and the sliding window, so that image rendering is performed based on the optimized pose. By expanding the positioning constraint conditions of the SLAM system, the positioning accuracy of the system is improved and the image rendering effect is optimized, while the real-time performance of the system in image processing is guaranteed.
The image rendering device provided by the embodiment of the disclosure can execute the image rendering method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the method.
It should be noted that each unit and module included in the above apparatus are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for convenience of distinguishing from each other, and are not used to limit the protection scope of the embodiments of the present disclosure.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. Referring now to fig. 4, a schematic diagram of an electronic device (e.g., a terminal device or server in fig. 4) 500 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 4 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 4, the electronic device 500 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 501, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic device 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 507 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 508 including, for example, magnetic tape, hard disk, etc.; and communication means 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 shows an electronic device 500 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or from the storage means 508, or from the ROM 502. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 501.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The electronic device provided by the embodiment of the present disclosure and the method for rendering an image provided by the foregoing embodiments belong to the same inventive concept; technical details not described in detail in this embodiment may be found in the foregoing embodiments, and this embodiment has the same beneficial effects as the foregoing embodiments.
The present disclosure provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the method of rendering an image provided by the above embodiments.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
when the current frame is detected to be updated to a sliding window as a key frame to be used and the key frame to be applied overflows from the sliding window, determining the point cloud data to be processed associated with the key frame to be applied;
determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window;
updating a constraint set based on the sliding window, the target point cloud data and the key frame to be applied;
and when the next frame is received, determining the optimal pose of the next frame according to the constraint set and the sliding window so as to perform image rendering based on the optimal pose.
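In outline, the program steps above amount to sliding-window bookkeeping. The following Python sketch is illustrative only — the class, field, and threshold names (`SlidingWindowTracker`, `points`, `WINDOW_SIZE`) are assumptions, not part of the disclosure, and the pose-optimization step is omitted because the disclosure leaves the optimizer unspecified:

```python
from collections import deque

WINDOW_SIZE = 10  # hypothetical capacity of the sliding window


class SlidingWindowTracker:
    """Illustrative sketch of the described steps; not the patented implementation."""

    def __init__(self):
        self.window = deque()     # key frames to be used, oldest first
        self.constraint_set = []  # pairs of (key frame to be applied, target point cloud data)

    def on_key_frame(self, key_frame):
        # Update the current frame into the sliding window as a key frame to be used.
        self.window.append(key_frame)
        if len(self.window) > WINDOW_SIZE:
            # The oldest key frame overflows from the window as the key frame to be applied.
            overflowed = self.window.popleft()
            # Keep only the associated point cloud data that no key frame
            # remaining in the window still observes (the target point cloud data).
            target_points = [p for p in overflowed["points"]
                             if not self._observed_in_window(p)]
            # Update the constraint set with the overflowed key frame
            # and its target point cloud data.
            self.constraint_set.append((overflowed, target_points))

    def _observed_in_window(self, point):
        return any(point in kf["points"] for kf in self.window)
```

When the next frame arrives, both the constraint set and the frames still in the window would serve as constraints for its pose; that solver is outside this sketch.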
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of a unit does not in any way constitute a limitation of the unit itself; for example, the first acquisition unit may also be described as "a unit for acquiring at least two Internet protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the specific combinations of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure — for example, solutions in which the above features are interchanged with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (11)

1. A method of rendering an image, comprising:
when the current frame is detected to be updated to a sliding window as a key frame to be used and the key frame to be applied overflows from the sliding window, determining the point cloud data to be processed associated with the key frame to be applied;
determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window;
updating a constraint set based on the sliding window, the target point cloud data and the key frame to be applied;
and when the next frame is received, determining the optimized pose of the next frame according to the constraint set and the sliding window, and performing image rendering based on the optimized pose.
2. The method of claim 1, wherein the detecting that the current frame is updated to a sliding window as a key frame to be used and the key frame to be applied overflows from the sliding window comprises:
when the current frame is determined to be a key frame, updating the current frame serving as the key frame to be used into the sliding window;
when the number of key frames to be used in the sliding window reaches a preset number threshold, determining key frames to be applied from the key frames to be used according to the time stamp of each key frame to be used;
and overflowing the key frames to be applied from the sliding window.
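The overflow rule recited in claim 2 can be sketched minimally as follows; the `timestamp` field and function name are illustrative assumptions, not part of the claim:

```python
def overflow_key_frame(window, threshold):
    """When the window holds `threshold` key frames to be used, overflow the one
    with the earliest timestamp as the key frame to be applied; otherwise return
    None and leave the window untouched. Illustrative sketch only."""
    if len(window) < threshold:
        return None
    # The key frame to be applied is selected according to the timestamps.
    to_apply = min(window, key=lambda kf: kf["timestamp"])
    window.remove(to_apply)
    return to_apply
```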
3. The method of claim 1, wherein the determining the point cloud data to be processed associated with the key frame to be applied comprises:
retrieving, as the point cloud data to be processed, the point cloud data for which the key frame to be applied serves as the reference key frame in triangulation.
4. The method according to claim 1, wherein the determining target point cloud data from the point cloud data to be processed according to each key frame to be used located in the sliding window comprises:
for each point cloud data to be processed, determining whether a reference key frame exists in the sliding window of the current point cloud data to be processed;
if not, determining the current point cloud data to be processed as the target point cloud data.
5. The method of claim 4, wherein the determining whether the currently pending point cloud data has a reference keyframe in the sliding window comprises:
traversing the key frames to be used in the sliding window in order of increasing time interval from the current time, and determining whether the current point cloud data to be processed is observed in each key frame to be used;
and taking the key frame to be used in which the current point cloud data to be processed is observed for the first time as the reference key frame.
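The traversal recited in claim 5 can be sketched as below; the `timestamp` and `observations` fields, the `now` parameter, and the function name are illustrative assumptions:

```python
def find_reference_key_frame(point_id, window, now):
    """Traverse the key frames to be used in order of increasing time interval
    from the current time `now`; the first frame observing the point becomes
    its reference key frame. Returns None when no frame in the window sees it."""
    for kf in sorted(window, key=lambda kf: now - kf["timestamp"]):
        if point_id in kf["observations"]:
            return kf
    return None
```

Under claim 4, a point whose search returns None here would become target point cloud data.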
6. The method of claim 1, wherein the updating the constraint set based on the sliding window, the target point cloud data, and the key frame to be applied comprises:
updating the target point cloud data and the key frame to be applied into the constraint set;
updating the updated constraint set based on the key frame to be used in the sliding window and the key frame to be applied in the updated constraint set.
7. The method of claim 6, wherein updating the updated set of constraints based on the key frames to be used in the sliding window and the key frames to be applied in the updated set of constraints comprises:
obtaining a target to-be-used key frame with the longest time interval duration from the current time in the sliding window, and a target to-be-applied key frame with the longest time interval duration from the current time in the constraint set;
Determining common viewpoint data of the target to-be-used key frame and the target to-be-applied key frame;
and if the common view data does not meet the preset condition, deleting the target key frame to be applied from the constraint set so as to update the constraint set.
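The pruning step of claim 7 can be sketched as follows; here `age` stands in for the time interval from the current time, and the `min_covisible` threshold is a hypothetical stand-in for the unspecified "preset condition":

```python
def prune_constraint_set(window, constraint_set, min_covisible=20):
    """Compare the key frame to be used farthest from the current time with the
    key frame to be applied farthest from the current time; if their common-view
    point count falls below a (hypothetical) threshold, drop the latter from the
    constraint set. Illustrative sketch only."""
    if not window or not constraint_set:
        return
    target_used = max(window, key=lambda kf: kf["age"])
    target_applied = max(constraint_set, key=lambda kf: kf["age"])
    # Common-view point data: points observed by both target key frames.
    common_view = target_used["points"] & target_applied["points"]
    if len(common_view) < min_covisible:
        constraint_set.remove(target_applied)
```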
8. The method of claim 1, wherein said determining an optimized pose for the next frame according to the constraint set and the sliding window comprises:
taking the key frame to be applied and the target point cloud data in the constraint set, and the key frame to be used and the point cloud data to be processed in the sliding window, as constraint conditions for determining the pose of the next frame;
and determining the optimized pose of the next frame based on the constraint condition.
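The four kinds of constraint conditions recited in claim 8 can be gathered as below; the tags, the `points` field, and the function name are illustrative assumptions, and the pose optimizer that would consume this list is left unspecified by the claims:

```python
def gather_constraints(window, constraint_set):
    """Collect the constraint conditions of claim 8: key frames to be applied
    and target point cloud data from the constraint set, plus key frames to be
    used and point cloud data to be processed from the sliding window."""
    conditions = []
    for key_frame, target_points in constraint_set:
        conditions.append(("applied", key_frame, target_points))
    for key_frame in window:
        conditions.append(("used", key_frame, key_frame["points"]))
    return conditions
```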
9. An apparatus for rendering an image, comprising:
the system comprises a to-be-processed point cloud data determining module, a processing module and a processing module, wherein the to-be-processed point cloud data determining module is used for determining to-be-processed point cloud data associated with a to-be-applied key frame when the current frame is detected to be updated to a sliding window as the to-be-used key frame and the to-be-applied key frame overflows from the sliding window;
the target point cloud data determining module is used for determining target point cloud data from the point cloud data to be processed according to each key frame to be used in the sliding window;
the constraint set updating module is used for updating a constraint set based on the sliding window, the target point cloud data and the key frame to be applied;
and the optimization pose determining module is used for determining the optimization pose of the next frame according to the constraint set and the sliding window when the next frame is received, so as to perform image rendering based on the optimization pose.
10. An electronic device, the electronic device comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of rendering an image as recited in any of claims 1-8.
11. A storage medium containing computer executable instructions for performing the method of rendering an image as claimed in any one of claims 1 to 8 when executed by a computer processor.
CN202211153154.5A 2022-09-21 2022-09-21 Method, device, electronic equipment and storage medium for rendering image Active CN115937383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211153154.5A CN115937383B (en) 2022-09-21 2022-09-21 Method, device, electronic equipment and storage medium for rendering image


Publications (2)

Publication Number Publication Date
CN115937383A CN115937383A (en) 2023-04-07
CN115937383B true CN115937383B (en) 2023-10-10

Family

ID=86549546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211153154.5A Active CN115937383B (en) 2022-09-21 2022-09-21 Method, device, electronic equipment and storage medium for rendering image

Country Status (1)

Country Link
CN (1) CN115937383B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311684A (en) * 2020-04-01 2020-06-19 亮风台(上海)信息科技有限公司 Method and equipment for initializing SLAM
CN111402339A (en) * 2020-06-01 2020-07-10 深圳市智绘科技有限公司 Real-time positioning method, device, system and storage medium
CN111882494A (en) * 2020-06-28 2020-11-03 广州文远知行科技有限公司 Pose graph processing method and device, computer equipment and storage medium
CN112258618A (en) * 2020-11-04 2021-01-22 中国科学院空天信息创新研究院 Semantic mapping and positioning method based on fusion of prior laser point cloud and depth map
CN112634451A (en) * 2021-01-11 2021-04-09 福州大学 Outdoor large-scene three-dimensional mapping method integrating multiple sensors
WO2021128297A1 (en) * 2019-12-27 2021-07-01 深圳市大疆创新科技有限公司 Method, system and device for constructing three-dimensional point cloud map
WO2021253430A1 (en) * 2020-06-19 2021-12-23 深圳市大疆创新科技有限公司 Absolute pose determination method, electronic device and mobile platform
CN113989451A (en) * 2021-10-28 2022-01-28 北京百度网讯科技有限公司 High-precision map construction method and device and electronic equipment
CN114419213A (en) * 2022-01-24 2022-04-29 北京字跳网络技术有限公司 Image processing method, device, equipment and storage medium
CN114419259A (en) * 2022-03-30 2022-04-29 中国科学院国家空间科学中心 Visual positioning method and system based on physical model imaging simulation




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant