CN117397239A - Encoding and decoding method and device for 3D map - Google Patents

Encoding and decoding method and device for 3D map

Info

Publication number
CN117397239A
CN117397239A (application CN202180098602.3A)
Authority
CN
China
Prior art keywords
map
map point
point
current
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180098602.3A
Other languages
Chinese (zh)
Inventor
曹潇然
蔡康颖
涂晨曦
王培�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN117397239A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application provides an encoding and decoding method and apparatus for a 3D map. The encoding method comprises: acquiring a plurality of 3D map points contained in a 3D map, where the plurality of 3D map points have cross-modal correlation, meaning that the order of any one 3D map point among the plurality of 3D map points is associated with a plurality of attributes of the 3D map points, and the plurality of attributes include at least a 3D map point descriptor; predicting the data of a current 3D map point according to the data of at least one reference 3D map point to obtain residual data of the current 3D map point, where the data include at least the plurality of attributes, the plurality of 3D map points include the current 3D map point and the at least one reference 3D map point, and each reference 3D map point is a 3D map point encoded before the current 3D map point; and encoding the residual data of the current 3D map point to obtain a code stream. The method and apparatus can improve the transmission efficiency of the 3D map.

Description

Encoding and decoding method and device for 3D map

Technical Field
The present disclosure relates to 3D map technologies, and in particular, to a method and an apparatus for encoding and decoding a 3D map.
Background
Virtual reality (VR), augmented reality (AR), and mixed reality (MR) technologies are multimedia virtual-scene technologies that have emerged in recent years. These technologies can create virtual scenes and superimpose them on the real world, producing a new visual environment and interactive experience. In such applications, an electronic device needs to determine its own pose in the current environment in order to accurately fuse virtual objects with the real scene.
On the other hand, in applications such as autonomous driving, autonomous navigation, automatic unmanned-aerial-vehicle inspection, and industrial robotics, carriers such as vehicles, unmanned aerial vehicles, and robots need to determine their own pose in the current environment from the pose of the electronic device mounted on them, so as to perform accurate path planning, navigation, detection, and control.
In the above applications, the pose of the electronic device in the current environment is typically determined as follows: the electronic device receives a 3D map of the environment from a server or another device, acquires visual information of the environment through local sensors, and determines its current pose by combining the acquired visual information with the downloaded 3D map.
However, an original 3D map generally contains a huge amount of data, and transmitting the map consumes a great deal of bandwidth and time, which severely limits application performance and affects the user experience.
Disclosure of Invention
The present application provides an encoding and decoding method and apparatus for a 3D map, which are used to reduce the data amount of the 3D map, thereby reducing the transmission bandwidth and improving the transmission efficiency.
In a first aspect, the present application provides a method for encoding a 3D map, including: acquiring a plurality of 3D map points contained in a 3D map, where the plurality of 3D map points have cross-modal correlation, meaning that the order of any one 3D map point among the plurality of 3D map points is associated with a plurality of attributes of the 3D map points, and the plurality of attributes include at least a 3D map point descriptor; predicting the data of a current 3D map point according to the data of at least one reference 3D map point to obtain residual data of the current 3D map point, where the data include at least the plurality of attributes, the plurality of 3D map points include the current 3D map point and the at least one reference 3D map point, and each of the at least one reference 3D map point is a 3D map point encoded before the current 3D map point is encoded; and encoding the residual data of the current 3D map point to obtain a code stream.
The 3D map point descriptor is a vector comprising at least one component. When an attribute of any one of the plurality of 3D map points belongs to the plurality of attributes, the order of that 3D map point among the plurality of 3D map points is associated with that attribute. The residual data of the current 3D map point are encapsulated to obtain the code stream. After the code stream is obtained, it may be stored or transmitted to a decoding apparatus.
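Purely as an illustration of the flow just described, the following Python sketch computes residual data for a current 3D map point from reference points. The MapPoint layout, the averaging predictor, and the omission of the entropy-coding/encapsulation step are assumptions of this sketch, not the concrete implementation of the present application.

```python
import numpy as np

# Assumed data layout for this sketch: each 3D map point carries a
# descriptor vector and an XYZ spatial position.
class MapPoint:
    def __init__(self, descriptor, position):
        self.descriptor = np.asarray(descriptor, dtype=np.float32)
        self.position = np.asarray(position, dtype=np.float32)

def encode_point(current, references):
    # Predicted value per attribute (the average of the references is one
    # of the predictor options discussed later in this application).
    desc_pred = np.mean([r.descriptor for r in references], axis=0)
    pos_pred = np.mean([r.position for r in references], axis=0)
    # Residual data = actual data minus predicted value; the residuals
    # would then be entropy-encoded and encapsulated into the code stream.
    return current.descriptor - desc_pred, current.position - pos_pred
```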
In this encoding method, the obtained residual data have a small data amount. When the encoding apparatus stores the code stream, the storage space required for the 3D map is reduced; when the encoding apparatus transmits the code stream, the bandwidth occupied by transmitting the 3D map is reduced and the transmission efficiency of the 3D map is improved.
In one possible implementation, the plurality of attributes of the 3D map points further include a 3D map point spatial position. The spatial position of a 3D map point can be represented by coordinates X, Y, and Z on three-dimensional spatial axes, by longitude, latitude, and altitude, by polar coordinates, or the like.
In one possible implementation, the method further includes: sorting the plurality of 3D map points according to the plurality of attributes of the plurality of 3D map points. The plurality of 3D map points may be sorted based on the similarity of the plurality of attributes between the 3D map points; in general, the more similar the plurality of attributes are, the higher the similarity between two 3D map points can be considered to be. One assumed realization of such a sorting is sketched below.
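The following Python sketch illustrates this with a greedy pass that always appends the not-yet-placed point most similar to the last placed one. The combined descriptor/position distance and its weights are illustrative assumptions, not something mandated by this application.

```python
import numpy as np

def attribute_distance(a, b, w_desc=1.0, w_pos=1.0):
    # Illustrative dissimilarity over the plurality of attributes:
    # a smaller distance means the two 3D map points are more similar.
    return (w_desc * np.linalg.norm(a.descriptor - b.descriptor)
            + w_pos * np.linalg.norm(a.position - b.position))

def sort_map_points(points):
    # Greedy ordering: start from an arbitrary point and repeatedly append
    # the remaining point most similar to the previously placed one, so that
    # adjacent points in the resulting sequence have associated attributes.
    remaining = list(points)
    sequence = [remaining.pop(0)]
    while remaining:
        nxt = min(remaining, key=lambda p: attribute_distance(sequence[-1], p))
        remaining.remove(nxt)
        sequence.append(nxt)
    return sequence
```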
In one possible implementation, the plurality of 3D map points are arranged in a sequence, and each element in the sequence corresponds to one of the plurality of 3D map points; in the sequence, the plurality of attributes of two adjacent 3D map points are associated. The sequence may be, for example, a list, an array, or a character string; the embodiment of the present application does not limit the form of the sequence.
In the sequence obtained by the above process, the plurality of attributes of two adjacent 3D map points are associated. Since more similar attributes generally imply a higher similarity between two 3D map points, sorting the plurality of 3D map points by the plurality of attributes at least ensures a high similarity between adjacent 3D map points, and thus improves the similarity among the 3D map points in the sequence as a whole.
In one possible implementation, the plurality of 3D map points are arranged in a topology tree, and each node in the topology tree corresponds to one 3D map point of the plurality of 3D map points; in the topology tree, the plurality of attributes of a 3D map point located at a parent node position and a 3D map point located at a child node position are associated. The topology tree may be, for example, a B-tree, a B+ tree, a B* tree, or a hash tree; the embodiment of the present application does not limit the form of the topology tree.
In the topology tree obtained by the above process, the plurality of attributes of a 3D map point at a parent node position and a 3D map point at a child node position are associated. Since more similar attributes generally imply a higher similarity between two 3D map points, sorting the plurality of 3D map points by the plurality of attributes at least ensures a high similarity between a parent-node 3D map point and its child-node 3D map points, and thus improves the similarity among the 3D map points in the topology tree as a whole. One assumed way of building such a tree follows.
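As an assumed illustration (reusing attribute_distance from the previous sketch), a topology tree with the property described above can be grown by attaching each point as a child of the most similar node already in the tree:

```python
class TreeNode:
    def __init__(self, point, parent=None):
        self.point = point
        self.parent = parent
        self.children = []

def build_topology_tree(points):
    # Insert points one by one; each new point becomes a child of the most
    # similar node already in the tree, so the attributes of a parent-node
    # 3D map point and its child-node 3D map points are associated.
    root = TreeNode(points[0])
    nodes = [root]
    for p in points[1:]:
        parent = min(nodes, key=lambda n: attribute_distance(n.point, p))
        child = TreeNode(p, parent)
        parent.children.append(child)
        nodes.append(child)
    return root
```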
In one possible implementation, the method further includes: determining the at least one reference 3D map point. The at least one reference 3D map point is used for predicting the current 3D map point. The higher the similarity between the plurality of attributes of the at least one reference 3D map point and those of the current 3D map point, the better the prediction effect of the at least one reference 3D map point, and the smaller the data amount of the residual data obtained by the subsequent prediction.
When the plurality of 3D map points are arranged in a sequential manner and ordered in the manner described above, in one possible implementation, the determining the at least one reference 3D map point includes: a first 3D map point is determined as the at least one reference 3D map point, the first 3D map point being a 3D map point arranged in the sequence preceding the current 3D map point.
In this implementation, the plurality of attributes of the first 3D map point are associated with those of the current 3D map point, i.e., the first 3D map point has a high similarity to the current 3D map point. By determining the first 3D map point as a reference 3D map point and then predicting the current 3D map point according to it, the prediction effect can be improved and the data amount of the residual data obtained by prediction can be effectively reduced.
When the plurality of 3D map points are arranged in a sequential manner and ordered in the manner described above, in one possible implementation, the determining the at least one reference 3D map point includes: determining a first 3D map point and a second 3D map point as the at least one reference 3D map point, the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point, the second 3D map point being a 3D map point arranged in the sequence before the first 3D map point.
In this implementation, the plurality of attributes of the first 3D map point are associated with those of the current 3D map point, and the plurality of attributes of the second 3D map point are associated with those of the first 3D map point. That is, the similarity between the first 3D map point and the current 3D map point is high, the similarity between the second 3D map point and the first 3D map point is high, and therefore the similarity between the second 3D map point and the current 3D map point is also high. By determining the first 3D map point and the second 3D map point as reference 3D map points and subsequently combining them to predict the current 3D map point, the accuracy of prediction can be improved and the data amount of the residual data obtained by prediction can be further reduced.
When the plurality of 3D map points are arranged in a topology tree manner and ordered in the above manner, in one possible implementation, the determining the at least one reference 3D map point includes: a third 3D map point is determined as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point.
In this implementation, the plurality of attributes of the third 3D map point are associated with those of the current 3D map point, i.e., the third 3D map point has a high similarity to the current 3D map point. By determining the third 3D map point as a reference 3D map point and then predicting the current 3D map point according to it, the prediction effect can be improved and the data amount of the residual data obtained by prediction can be effectively reduced.
When the plurality of 3D map points are arranged in a topology tree manner and ordered in the above manner, in one possible implementation, the determining the at least one reference 3D map point includes: determining a third 3D map point and a fourth 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point, the fourth 3D map point being a 3D map point located in the topology tree at a parent node position of the third 3D map point.
In this implementation, the third 3D map point has a high similarity to the current 3D map point, the fourth 3D map point has a high similarity to the third 3D map point, and therefore the fourth 3D map point also has a high similarity to the current 3D map point. By determining the third 3D map point and the fourth 3D map point as reference 3D map points and subsequently combining them to predict the current 3D map point, the accuracy of prediction can be improved and the data amount of the residual data obtained by prediction can be further reduced.
In one possible implementation, predicting the data of the current 3D map point according to the data of at least one reference 3D map point to obtain the residual data of the current 3D map point includes: acquiring a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and subtracting the predicted value from the data of the current 3D map point to obtain the residual data of the current 3D map point. The predicted value may be obtained as, for example, the extremum, the average, or a weighted sum of the data of the at least one reference 3D map point, as sketched below.
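The following sketch shows the three predictor options named above (extremum, average, weighted sum) applied to one attribute; the function name and signature are assumptions of this illustration.

```python
import numpy as np

def predicted_value(reference_values, mode="average", weights=None):
    # reference_values: the same attribute (e.g., the descriptor vector)
    # taken from each of the at least one reference 3D map point.
    vals = np.stack([np.asarray(v, dtype=np.float32) for v in reference_values])
    if mode == "extremum":
        return vals.max(axis=0)  # vals.min(axis=0) is the other extremum
    if mode == "weighted":
        w = np.asarray(weights, dtype=np.float32)
        return (w[:, None] * vals).sum(axis=0) / w.sum()
    return vals.mean(axis=0)     # default: average
```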
In one possible implementation, the data include a 3D map point descriptor and a 3D map point spatial position, and predicting the data of the current 3D map point according to the data of at least one reference 3D map point to obtain the residual data of the current 3D map point includes: predicting the 3D map point descriptor of the current 3D map point according to the 3D map point descriptor of the at least one reference 3D map point to obtain residual data of the 3D map point descriptor of the current 3D map point; and predicting the 3D map point spatial position of the current 3D map point according to the 3D map point spatial position of the at least one reference 3D map point to obtain residual data of the 3D map point spatial position of the current 3D map point. First, a predicted value of the 3D map point descriptor of the current 3D map point is obtained based on the 3D map point descriptor of the at least one reference 3D map point; this predicted value is then subtracted from the actual 3D map point descriptor of the current 3D map point to obtain the residual data of the 3D map point descriptor. The residual data of the 3D map point spatial position are obtained in the same way: the predicted value is obtained first, and the residual is the actual value minus the predicted value.
In one possible implementation, the code stream further includes indication information of the at least one reference 3D map point. For example, the indication information of the reference 3D map point may include an identification of the reference 3D map point, an order of the reference 3D map point among the plurality of 3D map points, an order relationship or topology relationship of the reference 3D map point and the current 3D map point, and the like.
In a second aspect, the present application provides a decoding method of a 3D map, including: decoding the code stream to obtain residual data of a current 3D map point, wherein the current 3D map point belongs to a 3D map, and the 3D map comprises a plurality of 3D map points; acquiring data of the current 3D map point according to the data of at least one reference 3D map point and residual data of the current 3D map point, wherein the data at least comprises a 3D map point descriptor, the plurality of 3D map points comprise the at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point decoded before the current 3D map point is decoded.
In this decoding method, the received code stream carries the residual data of the current 3D map point. Because the residual data have a small data amount, the bandwidth occupied when the 3D map is transmitted to the decoding apparatus is reduced, and the transmission efficiency of the 3D map is improved.
In one possible implementation, obtaining the data of the current 3D map point according to the data of at least one reference 3D map point and the residual data of the current 3D map point includes: acquiring a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and adding the residual data of the current 3D map point to the predicted value to obtain the data of the current 3D map point.
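A minimal sketch of this reconstruction, reusing MapPoint and predicted_value from the earlier encoder-side sketches and assuming the decoder selects the same reference points as the encoder:

```python
def decode_point(desc_residual, pos_residual, references, mode="average"):
    # The decoder derives the same predicted value from already-decoded
    # reference 3D map points, then adds the residual data to recover the
    # data of the current 3D map point.
    desc = predicted_value([r.descriptor for r in references], mode) + desc_residual
    pos = predicted_value([r.position for r in references], mode) + pos_residual
    return MapPoint(desc, pos)
```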
In one possible implementation, the data further includes a 3D map point spatial location.
In one possible implementation, the method further includes: determining the at least one reference 3D map point. The at least one reference 3D map point determined by the decoding apparatus must be consistent with the at least one reference 3D map point determined by the encoding apparatus.
In one possible implementation, when the plurality of 3D map points are arranged in a sequential manner, the determining the at least one reference 3D map point includes: a first 3D map point is determined as the at least one reference 3D map point, the first 3D map point being a last decoded 3D map point of the current 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a sequential manner, the determining the at least one reference 3D map point includes: determining a first 3D map point and a second 3D map point as the at least one reference 3D map point, the first 3D map point being a last decoded 3D map point of the current 3D map point, the second 3D map point being a last decoded 3D map point of the first 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the determining the at least one reference 3D map point includes: a third 3D map point is determined as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the determining the at least one reference 3D map point includes: determining a third 3D map point and a fourth 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point, the fourth 3D map point being a 3D map point located in the topology tree at a parent node position of the third 3D map point.
In one possible implementation, obtaining the data of the current 3D map point according to the data of at least one reference 3D map point and the residual data of the current 3D map point includes: acquiring the 3D map point descriptor of the current 3D map point according to the residual data of the 3D map point descriptor of the current 3D map point and the 3D map point descriptor of the at least one reference 3D map point; and acquiring the 3D map point spatial position of the current 3D map point according to the residual data of the 3D map point spatial position of the current 3D map point and the 3D map point spatial position of the at least one reference 3D map point.
In a third aspect, the present application provides an encoding apparatus for a 3D map, including: an acquisition module, configured to acquire a plurality of 3D map points contained in a 3D map, where the plurality of 3D map points have cross-modal correlation, that is, the order of any one 3D map point among the plurality of 3D map points is associated with a plurality of attributes of the 3D map points, and the plurality of attributes include at least a 3D map point descriptor; a prediction module, configured to predict the data of a current 3D map point according to the data of at least one reference 3D map point to obtain residual data of the current 3D map point, where the data include at least the plurality of attributes, the plurality of 3D map points include the current 3D map point and the at least one reference 3D map point, and each of the at least one reference 3D map point is a 3D map point encoded before the current 3D map point is encoded; and an encapsulation module, configured to encode the residual data of the current 3D map point to obtain a code stream.
The residual data obtained by the encoding apparatus have a small data amount. When the encoding apparatus stores the code stream, the storage space required for the 3D map is reduced; when the encoding apparatus transmits the code stream, the bandwidth occupied by transmitting the 3D map is reduced and the transmission efficiency of the 3D map is improved.
In one possible implementation, the plurality of attributes of the 3D map points further includes a 3D map point spatial location.
In a possible implementation, the prediction module is further configured to sort the plurality of 3D map points according to the plurality of attributes of the plurality of 3D map points.
In one possible implementation, the plurality of 3D map points are arranged in a sequence, and each element in the sequence corresponds to one 3D map point in the plurality of 3D map points; in the sequence, the plurality of attributes between two adjacent 3D map points are associated.
In one possible implementation, the plurality of 3D map points are arranged in a topology tree, and each node in the topology tree corresponds to one 3D map point of the plurality of 3D map points; in the topology tree, the plurality of attributes are associated between a 3D map point located at a parent node position and a 3D map point located at a child node position.
In one possible implementation, the prediction module is further configured to determine the at least one reference 3D map point.
In a possible implementation manner, when the plurality of 3D map points are arranged in a sequence, the prediction module is specifically configured to determine a first 3D map point as the at least one reference 3D map point, where the first 3D map point is a 3D map point arranged in the sequence before the current 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a sequence, the prediction module is specifically configured to determine, as the at least one reference 3D map point, a first 3D map point and a second 3D map point, where the first 3D map point is a 3D map point arranged in the sequence before the current 3D map point, and the second 3D map point is a 3D map point arranged in the sequence before the first 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the prediction module is specifically configured to determine a third 3D map point as the at least one reference 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree.
In one possible implementation manner, when the plurality of 3D map points are arranged according to a topology tree, the prediction module is specifically configured to determine, as the at least one reference 3D map point, a third 3D map point and a fourth 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree, and the fourth 3D map point is a 3D map point located at a parent node position of the third 3D map point in the topology tree.
In a possible implementation manner, the prediction module is specifically configured to obtain a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and subtracting the obtained predicted value of the current 3D map point from the data of the current 3D map point to obtain residual data of the current 3D map point.
In a possible implementation, the prediction module is specifically configured to predict the 3D map point descriptor of the current 3D map point according to the 3D map point descriptor of the at least one reference 3D map point to obtain residual data of the 3D map point descriptor of the current 3D map point; and to predict the 3D map point spatial position of the current 3D map point according to the 3D map point spatial position of the at least one reference 3D map point to obtain residual data of the 3D map point spatial position of the current 3D map point.
In one possible implementation, the code stream further includes indication information of the at least one reference 3D map point.
In a fourth aspect, the present application provides a decoding apparatus for a 3D map, including: a decapsulation module, configured to decode a code stream to obtain residual data of a current 3D map point, where the current 3D map point belongs to a 3D map, and the 3D map includes a plurality of 3D map points; and a prediction module, configured to obtain the data of the current 3D map point according to the data of at least one reference 3D map point and the residual data of the current 3D map point, where the data include at least a 3D map point descriptor, the plurality of 3D map points include the at least one reference 3D map point, and each of the at least one reference 3D map point is a 3D map point decoded before the current 3D map point is decoded.
The code stream received by the decoding apparatus carries the residual data of the current 3D map point. Because the residual data have a small data amount, the bandwidth occupied when the 3D map is transmitted to the decoding apparatus is reduced, and the transmission efficiency of the 3D map is improved.
In one possible implementation, the data further includes a 3D map point spatial location.
In a possible implementation manner, the prediction module is specifically configured to obtain a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and adding the residual data of the current 3D map point and the obtained predicted value of the current 3D map point to obtain the data of the current 3D map point.
In one possible implementation, the prediction module is further configured to determine the at least one reference 3D map point.
In a possible implementation, when the plurality of 3D map points are arranged in a sequential manner, the prediction module is specifically configured to determine a first 3D map point as the at least one reference 3D map point, where the first 3D map point is a last decoded 3D map point of the current 3D map point.
In a possible implementation manner, when the plurality of 3D map points are arranged in a sequential manner, the prediction module is specifically configured to determine, as the at least one reference 3D map point, a first 3D map point and a second 3D map point, where the first 3D map point is a last decoded 3D map point of the current 3D map point, and the second 3D map point is a last decoded 3D map point of the first 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the prediction module is specifically configured to determine a third 3D map point as the at least one reference 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree.
In one possible implementation manner, when the plurality of 3D map points are arranged according to a topology tree, the prediction module is specifically configured to determine, as the at least one reference 3D map point, a third 3D map point and a fourth 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree, and the fourth 3D map point is a 3D map point located at a parent node position of the third 3D map point in the topology tree.
In a possible implementation manner, the prediction module is specifically configured to obtain a 3D map point descriptor of the current 3D map point according to residual data of a 3D map point descriptor of the current 3D map point and the 3D map point descriptor of the at least one reference 3D map point; and acquiring the 3D map point space position of the current 3D map point according to the residual data of the 3D map point space position of the current 3D map point and the 3D map point space position of the at least one reference 3D map point.
In a fifth aspect, the present application provides an encoding apparatus of a 3D map, including: one or more processors; a memory for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the first aspects described above.
In a sixth aspect, the present application provides a decoding apparatus of a 3D map, including: one or more processors; a memory for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the second aspects described above.
In a seventh aspect, the present application provides a computer readable storage medium comprising a computer program which, when executed on a computer, causes the computer to perform the method of any one of the first to second aspects.
In an eighth aspect, the present application provides a computer program product comprising computer program code which, when run on a computer, causes the computer to perform the method of any one of the first to second aspects.
In a ninth aspect, the present application provides a 3D map coding bitstream, where the 3D map coding bitstream includes residual data of a current 3D map point, where the residual data of the current 3D map point is obtained by predicting data of the current 3D map point using data of at least one reference 3D map point, where the current 3D map point and the at least one reference 3D map point belong to a 3D map respectively, and each of the at least one reference 3D map point is a 3D map point encoded before encoding the current 3D map point; the 3D map comprises a plurality of 3D map points, the plurality of 3D map points have cross-modal correlation, which means that the order of any one 3D map point in the plurality of 3D map points is associated with a plurality of attributes thereof, the plurality of attributes of the 3D map points at least comprise 3D map point descriptors, and the data at least comprise the plurality of attributes.
In one possible implementation, the at least one reference 3D map point is taken from a plurality of ordered 3D map points.
In one possible implementation, the plurality of 3D map points are arranged in a sequence in which the plurality of attributes between two adjacent 3D map points are associated.
In one possible implementation, the at least one reference 3D map point includes a first 3D map point, the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point.
In one possible implementation, the at least one reference 3D map point includes a first 3D map point and a second 3D map point, the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point, the second 3D map point being a 3D map point arranged in the sequence before the first 3D map point.
In one possible implementation, the plurality of 3D map points are arranged in a topology tree in which the plurality of attributes between the 3D map points located at parent node positions and the 3D map points located at child node positions are associated.
In one possible implementation, the at least one reference 3D map point includes a third 3D map point, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree.
In one possible implementation, the at least one reference 3D map point includes a third 3D map point and a fourth 3D map point, the third 3D map point being a 3D map point located at the parent node position of the current 3D map point in the topology tree, and the fourth 3D map point being a 3D map point located at the parent node position of the third 3D map point in the topology tree.
Drawings
Fig. 1 is a schematic diagram of an application architecture according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a server 30 according to an embodiment of the present application;
Fig. 4a is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 4b is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 4c is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 4d is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 4e is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 4f is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 4g is a schematic diagram of a user interface displayed by an electronic device according to an embodiment of the present application;
Fig. 5 is a flowchart of a process 500 of an encoding method of a 3D map according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a sorting result provided in an embodiment of the present application;
Fig. 7 is a schematic diagram of a sorting result according to an embodiment of the present application;
Fig. 8 is a flowchart of a process 800 of a decoding method of a 3D map according to an embodiment of the present application;
Fig. 9 is a block diagram of an encoding apparatus 90 for a 3D map according to an embodiment of the present application;
Fig. 10 is a block diagram of a decoding apparatus 100 for a 3D map according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first," "second," and the like in the description and in the claims and drawings are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion, such as a series of steps or elements. The method, system, article, or apparatus is not necessarily limited to those explicitly listed but may include other steps or elements not explicitly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. The term "and/or" describes an association relationship of associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: only A, only B, or both A and B, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or similar expressions means any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a; b; c; "a and b"; "a and c"; "b and c"; or "a and b and c", where a, b, and c may each be single or plural.
Fig. 1 is a schematic diagram of an application architecture provided in an embodiment of the present application. As shown in fig. 1, the application architecture includes a plurality of electronic devices and a server. The plurality of electronic devices may include a first electronic device and one or more second electronic devices (two second electronic devices are taken as an example in fig. 1), the second electronic devices being the electronic devices other than the first electronic device. The electronic devices and the server, as well as the electronic devices themselves, may communicate with each other; for example, any device in the application architecture may communicate with the other devices through wireless fidelity (WiFi), Bluetooth, or 2G/3G/4G/5G cellular communication. It should be understood that other communication manners, including future communication manners, may also be adopted between the server and the electronic devices, which is not specifically limited here. It should be noted that "one or more second electronic devices" in the embodiments of the present application is only used to indicate the electronic devices other than the first electronic device and does not limit whether the types of the electronic devices are the same.
The electronic device may be various types of devices configured with a camera and a display assembly, for example, the electronic device may be a terminal device such as a mobile phone, a tablet computer, a notebook computer, a video recorder (in fig. 1, the electronic device is a mobile phone, for example), the electronic device may also be a device for virtual scene interaction, including VR glasses, AR devices, MR interaction devices, etc., the electronic device may also be a wearable electronic device such as a smart watch, a smart bracelet, etc., and the electronic device may also be a device mounted on a vehicle, an unmanned aerial vehicle, an industrial robot, etc. The specific form of the electronic device is not specifically limited in the embodiments of the present application.
Further, the electronic device may also be referred to as a User Equipment (UE), a subscriber station, a mobile unit, a subscriber unit, a wireless unit, a remote unit, a mobile device, a wireless communication device, a remote device, a mobile subscriber station, a terminal device, an access terminal, a mobile terminal, a wireless terminal, a smart terminal, a remote terminal, a handset, a user agent, a mobile client, a client, or some other suitable terminology.
The server may be one or more physical servers (one physical server is taken as an example in fig. 1), may be a computer cluster, may be a virtual machine or a cloud server of a cloud computing scene, and the like.
In this embodiment of the present application, the electronic device may install a virtual-scene application (application) such as a VR application, an AR application, or an MR application, and may run the VR, AR, or MR application based on a user operation (e.g., clicking, touching, sliding, shaking, or voice control). The electronic device may collect visual information of any object in the environment through a sensor, and then display a virtual object on the display component according to the collected visual information, where the virtual object may be a virtual object in a VR scene, an AR scene, or an MR scene (i.e., an object in the virtual environment).
In the embodiment of the application, the electronic device may install a navigation, detection and control application program, and run a corresponding application based on user control or a preset program. The electronic equipment can carry out applications such as path planning, object detection, carrier control and the like based on the pose and other state information of the electronic equipment in the current environment.
The visual information related to the embodiments of the present application includes, but is not limited to, images or video acquired by a camera (without depth information), images or video with depth information acquired by a depth sensor, data acquired by laser radar (LiDAR), and data acquired by millimeter-wave radar.
In the embodiment of the present application, the virtual scene application program in the electronic device may be an application program built in the electronic device itself, or may be an application program provided by a third party service provider installed by the user, which is not limited specifically.
In the embodiment of the application, the electronic device may be further configured with a simultaneous localization and mapping (SLAM) system, which can create a map in a completely unknown environment and use the map to perform positioning, pose (position and orientation) determination, navigation, and the like. In this embodiment, the map created by the SLAM system is referred to as a SLAM map. A SLAM map can be understood as a map drawn by the SLAM system according to environmental information acquired by an acquisition device, where the acquisition device may include a visual-information acquisition device and an inertial measurement unit (IMU) in the electronic device; the visual-information acquisition device may include, for example, a camera, a depth camera, a laser radar, or a millimeter-wave radar, and the IMU may include sensors such as a gyroscope and an accelerometer.
SLAM maps are also referred to as 3D maps in embodiments of the present application. It should be noted that the 3D map includes, but is not limited to, a SLAM map, and may also include a three-dimensional map created by using other technologies, which is not specifically limited in the embodiments of the present application.
In one possible implementation, the 3D map may include a plurality of 3D map points, and the data of the 3D map may include the data of each of the plurality of 3D map points. A 3D map point is a point of interest, or a point with a salient feature, in the environment.
One possible way to obtain the 3D map points is to capture the environment using devices such as laser radar, drone-view aerial photography (oblique photography), high-definition panoramic cameras, and high-definition industrial cameras, and to extract 3D map points from the captured data using methods such as oriented FAST and rotated BRIEF (ORB, which combines features from accelerated segment test (FAST) with the binary robust independent elementary features (BRIEF) descriptor), scale-invariant feature transform (SIFT), speeded-up robust features (SURF), BRIEF, binary robust invariant scalable keypoints (BRISK), fast retina keypoint (FREAK), and repeatable and reliable detector and descriptor (R2D2).
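As a hedged illustration of the feature-extraction step only, the snippet below uses OpenCV's ORB implementation to obtain binary descriptors from a single image; the file name is a placeholder, and turning such 2D features into 3D map points (matching across views and triangulating spatial positions) is beyond this sketch.

```python
import cv2

# Illustration: extract ORB keypoints and descriptors from one image.
img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
orb = cv2.ORB_create(nfeatures=1000)
keypoints, descriptors = orb.detectAndCompute(img, None)
# descriptors: one 32-byte binary descriptor per keypoint (BRIEF family);
# spatial positions of 3D map points would come from multi-view triangulation.
```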
The data of the 3D map points may include:
(1) 3D map point descriptor
The 3D map point descriptor is a vector used to represent the local features of a 3D map point. In a visual localization algorithm, 3D map point descriptors are used to match 3D map points against each other. One possible approach is to calculate the distance between two 3D map point descriptors (which may be a Euclidean distance, an inner-product distance, a Hamming distance, etc.); when the distance is less than a threshold, the two 3D map points are considered to match.
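A minimal sketch of such threshold-based matching; the Euclidean distance and the threshold value are illustrative assumptions (an inner-product or Hamming distance could be substituted, the latter for binary descriptors):

```python
import numpy as np

def descriptors_match(desc_a, desc_b, threshold=0.8):
    # Two 3D map points are considered a match when the distance between
    # their descriptors is less than a threshold.
    dist = np.linalg.norm(np.asarray(desc_a, dtype=np.float32)
                          - np.asarray(desc_b, dtype=np.float32))
    return dist < threshold
```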
(2) 3D map point spatial location
The spatial position of a 3D map point may be represented by coordinates X, Y, and Z on three-dimensional spatial axes, by longitude, latitude, and altitude, by polar coordinates, or the like; the method for representing the spatial position of a 3D map point is not specifically limited in the embodiments of the present application. The 3D map point spatial position may be an absolute position or a relative position of the 3D map point: for example, the center of the entire area may be taken as the origin, and all 3D map point spatial positions are then offsets relative to the spatial position of the origin.
In this embodiment of the present application, a number may be allocated to each 3D map point and written into the data of the 3D map, or the number of a 3D map point may be represented implicitly by the storage order of the plurality of 3D map points in memory. It should be noted that the order of the 3D map points in the 3D map has no real significance, so the number may be regarded as an identifier that distinguishes the 3D map points rather than a constraint on their order. For example, if a 3D map includes three 3D map points numbered 1, 2, and 3, they may be processed in the order 1, 2, 3, or 3, 2, 1, or 2, 1, 3, and so on.
In a possible implementation, the data of the 3D map further include a plurality of region descriptors, and any one of the plurality of region descriptors is used to describe the features of some or all of the plurality of 3D map points. That is, each region descriptor may describe the features of some or all of the 3D map points, so a region descriptor has a one-to-many relationship with 3D map points; and the features of each 3D map point may be described by some or all of the region descriptors, so a 3D map point also has a one-to-many relationship with region descriptors. There is therefore a many-to-many relationship between the plurality of region descriptors and the plurality of 3D map points. Generation methods for region descriptors include, but are not limited to, traditional methods such as bag of words (BOW) and vector of locally aggregated descriptors (VLAD), and newer methods based on NetVLAD and artificial intelligence (AI). Similarly, the plurality of region descriptors may also be identified by numbers that distinguish them, without constraining their order.
In one possible implementation, the data of the 3D map further includes a correspondence between 3D map points and region descriptors, in which the correspondence explicitly describes which 3D map points any one region descriptor corresponds to, and which region descriptors any 3D map point corresponds to.
Alternatively, the correspondence may be explicitly described by a correspondence table between the numbers of the region descriptors and the numbers of the 3D map points. For example, suppose the 3D map includes 3 region descriptors numbered T1 to T3 and 6 3D map points numbered P1 to P6; the correspondence table is shown in table 1.
TABLE 1

Region descriptor number    Numbers of corresponding 3D map points
T1                          P1, P2, P3
T2                          P2, P3
T3                          P3, P4, P5, P6
It should be noted that, table 1 is an example of a correspondence table between the numbers of the region descriptors and the numbers of the 3D map points, and the correspondence table may be presented in other formats or modes, which is not specifically limited in this application.
Alternatively, the above correspondence may be described implicitly by the storage locations of the region descriptors and the 3D map points: for example, T1 is stored in memory first, followed by the data of P1, P2, and P3; T2 is stored next, followed by the data of P2 and P3; and finally T3 is stored, followed by the data of P3, P4, P5, and P6.
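The following sketch serializes the example above in that implicit layout; the list-of-strings representation stands in for the real binary data and is an assumption of this illustration.

```python
# Implicit correspondence via storage order: each region descriptor is
# written first, immediately followed by the data of its 3D map points,
# so no explicit correspondence table is required.
layout = [
    ("T1", ["P1", "P2", "P3"]),
    ("T2", ["P2", "P3"]),
    ("T3", ["P3", "P4", "P5", "P6"]),
]

stream = []
for region_descriptor, map_points in layout:
    stream.append(region_descriptor)  # region descriptor data
    stream.extend(map_points)         # data of the corresponding 3D map points
# stream == ['T1','P1','P2','P3','T2','P2','P3','T3','P3','P4','P5','P6']
```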
Fig. 2 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application, where, as shown in fig. 2, the electronic device 20 may be at least one of the first electronic device and the one or more second electronic devices in the embodiment shown in fig. 1. It should be understood that the configuration shown in fig. 2 does not constitute a particular limitation on the electronic device 20. In other embodiments of the present application, electronic device 20 may include more or fewer components than the configuration shown in FIG. 2, or certain components may be combined, certain components may be separated, or different arrangements of components may be provided. The various components shown in fig. 2 may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
The electronic device 20 may include: chip 21, memory 22 (one or more computer-readable storage media), user interface 23, display assembly 24, camera 25, sensor 26, positioning module 27 for device positioning, and transceiver 28 for communication. These components may communicate with each other via one or more buses 29.
The chip 21 may integrate: one or more processors 211, a clock module 212, and a power management module 213. The clock module 212 integrated in the chip 21 is mainly used to provide the processor 211 with the timers required for data transmission and timing control. The processor 211 may perform operations according to instruction operation codes and timing signals, generate operation control signals, and complete the control of instruction fetching and instruction execution. The power management module 213 integrated in the chip 21 is mainly used to provide a stable, high-precision voltage to the chip 21 and the other components of the electronic device 20.
The processor 211 may also be referred to as a central processor (central processing unit, CPU), and the processor 211 may specifically include one or more processing units, for example, the processor 211 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
In one possible implementation, processor 211 may include one or more interfaces. The interfaces may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver transmitter (universal asynchronous receiver/transmitter, UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (subscriber identity module, SIM) interface, and/or a universal serial bus (universal serial bus, USB) interface, among others.
The memory 22 may be coupled to the processor 211 via the bus 29, or may be integrated with the processor 211, and is used for storing various software programs and/or sets of instructions. The memory 22 may include high-speed random access memory (e.g., cache memory), and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 22 may store an operating system, for example an embedded operating system such as Android, Apple iOS, Microsoft Windows, or a UNIX-like operating system such as Linux. The memory 22 may also store data such as image data, point cloud data, 3D map data, pose data, coordinate system conversion information, and map update information. The memory 22 may also store computer-executable program code including instructions, for example communication program instructions and related program instructions of a SLAM system. The memory 22 may also store one or more applications, such as virtual-scene applications (e.g., AR/VR/MR), map applications, image management applications, and navigation and control applications. The memory 22 may also store a user interface program that graphically presents the content of an application, e.g., virtual objects in a virtual scene such as AR/VR/MR, vividly displayed via the display component 24, and that receives user control operations on the application via input controls such as menus, dialog boxes, and buttons.
The user interface 23 may be, for example, a touch panel that can detect a user's operation instructions on it, or may be, for example, a keypad, physical keys, a mouse, or the like.
The electronic device 20 may include one or more display components 24. The electronic device 20 may implement a display function jointly through a display component 24 and, for example, a graphics processor (GPU) and an application processor (AP) in the chip 21. The GPU is a microprocessor for image processing that connects the display component 24 and the application processor, and performs mathematical and geometric calculations for graphics rendering. The display component 24 may display interface content output by the electronic device 20, for example, images, videos, etc. in virtual scenes such as AR/VR/MR. The interface content may include an interface of a running application program, a system-level menu, etc., and may specifically be composed of the following interface elements: input interface elements, such as buttons (Button), text input boxes (Text), slider bars (Scroll Bar), and menus (Menu); and output interface elements, such as windows (Window), labels (Label), images, videos, and animations.
The display assembly 24 may be a display panel, a lens (e.g., VR glasses), a projection screen, or the like. The display panel may also be referred to as a display screen, and may be, for example, a touch screen, a flexible screen, a curved screen, etc., or may be other optical components. It should be understood that the display screen of the electronic device in the embodiments of the present application may be a touch screen, a flexible screen, a curved screen, or other forms of screens, that is, the display screen of the electronic device has a function of displaying images, and the specific material and shape of the display screen are not specifically limited.
For example, when the display assembly 24 includes a display panel, the display panel may employ a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. Further, in one possible implementation, the touch panel in the user interface 23 and the display panel in the display assembly 24 may be coupled together; for example, the touch panel may be disposed below the display panel, with the touch panel configured to detect the touch pressure acting on the display panel when a user inputs a touch operation (e.g., clicking, sliding, touching, etc.) through the display panel, and the display panel configured to display content.
The camera 25 may be a monocular camera, a binocular camera, or a depth camera, and is used to capture images/video of the environment. The images/video captured by the camera 25 may be used, for example, as input data for the SLAM system, or may be displayed via the display assembly 24.
In one possible implementation, the camera 25 may also be considered as a sensor. The image acquired by the camera 25 may be in IMG format or other format types, which is not specifically limited in the embodiments of the present application.
The sensor 26 may be used to collect data related to a change in state (e.g., rotation, oscillation, movement, jitter, etc.) of the electronic device 20, and the data collected by the sensor 26 may be used as one of the input data for the SLAM system. The sensor 26 may include one or more sensors, such as an inertial measurement unit (inertial measurement unit, IMU), a time of flight (TOF) sensor, or the like. The IMU may include sensors such as a gyroscope for measuring angular velocity of the electronic device during movement, and an accelerometer for measuring acceleration of the electronic device during movement. The TOF sensor may include a light emitter for emitting light, e.g., laser light, infrared light, radar waves, etc., and a light receiver for detecting reflected light, e.g., reflected laser light, infrared light, radar waves, etc.
It should be noted that the sensor 26 may also include a plurality of other sensors, such as an inertial sensor, a barometer, a magnetometer, a wheel speed meter, etc., which are not particularly limited in this embodiment of the present application.
The positioning module 27 is configured to implement physical positioning of the electronic device 20, for example, to obtain an initial position of the electronic device 20. The positioning module 27 may comprise one or more of a WiFi positioning module, a Bluetooth positioning module, a base station positioning module, and a satellite positioning module. A global navigation satellite system (GNSS) may be provided in the satellite positioning module to aid positioning; the GNSS is not limited to the BeiDou system, the global positioning system (GPS), the GLONASS system, and the Galileo satellite navigation system (Galileo).
The transceiver 28 is used to enable communication between the electronic device 20 and other devices (e.g., servers, other electronic devices, etc.). The transceiver 28 integrates a transmitter and a receiver for transmitting and receiving radio frequency signals, respectively. In particular implementations, transceiver 28 includes, but is not limited to: an antenna system, a radio frequency (RF) transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a codec (CODEC) chip, a subscriber identification module (SIM) card, a storage medium, and the like. In one possible implementation, the transceiver 28 may also be implemented on a separate chip. The transceiver 28 supports data network communication over at least one of 2G/3G/4G/5G, etc., and/or supports at least one of the following short-range wireless communication modes: Bluetooth (BT), wireless fidelity (WiFi), near field communication (NFC), infrared (IR), ultra wide band (UWB), and ZigBee.
In the present embodiment, the processor 211 executes various functional applications and data processing of the electronic device 20 by running program codes stored in the memory 22.
Fig. 3 is a schematic structural diagram of a server 30 according to an embodiment of the present application, and as shown in fig. 3, the server 30 may be a server in the embodiment shown in fig. 1. The server 30 includes a processor 301, memory 302 (one or more computer-readable storage media), and a transceiver 303. These components may communicate between themselves via one or more buses 304.
Processor 301 may be one or more CPUs; where processor 301 is a CPU, that CPU may be single-core or multi-core.
The memory 302 may be coupled to the processor 301 via a bus 304, or may be otherwise coupled to the processor 301, and is used to store various program codes and/or sets of instructions, as well as data (e.g., map data, pose data, etc.). In particular implementations, memory 302 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), among others.
The transceiver 303 mainly integrates a receiver for receiving data (e.g., requests, images, etc.) transmitted by the electronic device and a transmitter for transmitting data (e.g., map data, pose data, etc.) to the electronic device.
It should be understood that the server 30 shown in fig. 3 is only one example provided in the embodiments of the present application, and the server 30 may also have more components than illustrated, and the embodiments of the present application are not limited in detail in this regard.
In the present embodiment, the processor 301 executes various functional applications and data processing of the server 30 by running program codes stored in the memory 302.
The term "coupled" as used in embodiments of the present application means directly connected or connected through one or more intervening components or circuits.
Fig. 4a is a schematic diagram of an application scenario provided in an embodiment of the present application, where, as shown in fig. 4a, the application scenario is that an electronic device collects visual information through a sensor, and determines a current pose of the electronic device by combining the visual information and a 3D map from a server.
The 3D map is provided by a server, namely the server creates a 3D map, then compresses the 3D map, and transmits the compressed data of the 3D map to the electronic equipment; and the electronic equipment receives the compressed data of the 3D map, then performs decompression processing to obtain the reconstructed data of the 3D map, and determines the current pose of the electronic equipment by combining the acquired visual information and the 3D map. The pose, i.e. the position and orientation information of the electronic device, may be an absolute pose in the world coordinate system or a relative pose with respect to a point in the environment.
In the embodiment of the application, the server can create the 3D map in advance, and store the 3D map locally after compressing, so that the storage space can be saved. In addition, the server may also transmit the compressed data of the 3D map to other devices, such as cloud storage.
1. The server creates a 3D map, compresses it to obtain compressed data of the 3D map, and stores the compressed data locally.
The server compresses the 3D map, so that local storage space can be saved.
2. The electronic device sends a map download request to the server. The map download request can be triggered in one of two ways:
(1) The user opens a map application program installed on the electronic equipment, the application program uploads the position information obtained based on GPS positioning or WiFi positioning to a corresponding server, and the uploading operation can trigger a map downloading request. Because the uploaded content includes the location information, the server may perform a preliminary estimation based on the location information, and transmit compressed data of the 3D map of the area to which the location point indicated by the location information belongs to the electronic device. The range of the area to which the location point indicated by the location information belongs may be set in advance, for example, the area may be administrative areas (including county, city, country, administrative area, etc.) of each level where the location point is located, or may be a circumference area with the location point as the center and the set distance as the radius.
(2) The user opens a map-like application installed on the electronic device and actively enters or selects an area in it; for example, the user actively enters "xx business center" or selects "A street" from the list "A street, B street, C street". The foregoing user operation may trigger a map download request. Because both the user's input and the user's selection include a geographic location, the server transmits compressed data of a 3D map of that geographic location to the electronic device.
It should be understood that, in addition to the two ways described above, the map download request may be triggered in other ways in the embodiments of the present application; for example, the electronic device automatically detects whether a condition for downloading the 3D map is met, or starts to download the 3D map upon detecting a change in ambient light or a change in the environment, so as to request the 3D map of a region from the server, where the size of the region is not specifically limited.
3. The server transmits compressed data of the 3D map to the electronic device.
4. The electronic device collects visual information.
It should be noted that, the steps 3 and 4 are independent from each other, and are not limited in sequence.
5. The electronic device decompresses the compressed data of the 3D map to obtain reconstructed data of the 3D map.
6. And the electronic equipment performs positioning in the 3D map according to the visual information to obtain the pose corresponding to the visual information.
After receiving the compressed data of the 3D map, the electronic device does not need to decompress immediately, and only needs to decompress before positioning based on visual information to obtain the reconstructed data of the 3D map. For example, the user may download compressed data of a 3D map of an area range in advance in a manner of downloading an "offline map", and decompress the compressed data of the 3D map when positioning is required.
Fig. 4b is a schematic diagram of an application scenario provided in an embodiment of the present application, where, as shown in fig. 4b, the application scenario is that an electronic device collects visual information through a sensor, and a server determines a current pose of the electronic device by combining the visual information from the electronic device and a 3D map.
The 3D map is provided by a server, i.e. the server creates a 3D map, then compresses the 3D map, storing the compressed data of the 3D map locally. And when visual information from the electronic equipment is received, the server performs decompression processing to obtain reconstruction data of the 3D map, and the current pose of the electronic equipment is determined by combining the visual information and the 3D map.
1. The server creates a 3D map, compresses it to obtain compressed data of the 3D map, and stores the compressed data locally.
2. The electronic device collects visual information.
3. The electronic device sends visual information to the server.
4. The server decompresses the compressed data of the 3D map to obtain reconstructed data of the 3D map.
It should be appreciated that the compression of the 3D map by the server is done to save storage space.
5. And the server performs positioning in the 3D map according to the visual information to obtain the pose corresponding to the visual information.
6. The server sends the pose to the electronic device.
Fig. 4c is a schematic diagram of an application scenario provided in an embodiment of the present application, where, as shown in fig. 4c, the application scenario is that an electronic device collects visual information through a sensor, and determines a current pose of the electronic device by combining the visual information and a 3D map.
The 3D map is provided by the electronic device, i.e. the electronic device creates a 3D map, and then performs a compression process on the 3D map, storing compressed data of the 3D map locally. And when the acquired visual information is acquired, the electronic equipment performs decompression processing to obtain reconstruction data of the 3D map, and the current pose of the electronic equipment is determined by combining the acquired visual information and the 3D map.
1. The electronic device creates a 3D map, compresses it to obtain compressed data of the 3D map, and stores the compressed data locally.
It should be appreciated that the electronic device performs the compression process on the 3D map in order to save storage space.
2. The electronic equipment acquires visual information through the sensor.
3. The electronic device decompresses the compressed data of the 3D map to obtain reconstructed data of the 3D map.
4. And the electronic equipment performs positioning in the 3D map according to the visual information to obtain the pose corresponding to the visual information.
Fig. 4d is a schematic diagram of an application scenario provided in the embodiment of the present application. As shown in fig. 4d, the application scenario is that the second electronic device collects visual information through a sensor, and determines the current pose of the second electronic device by combining the visual information and a 3D map from a server.
The 3D map is created by the first electronic device, namely the first electronic device creates a 3D map, the 3D map is compressed, then compressed data of the 3D map is sent to the server, the server sends the compressed data of the 3D map to the second electronic device, the second electronic device performs decompression processing to obtain reconstructed data of the 3D map, and the current pose of the second electronic device is determined by combining the acquired visual information and the 3D map.
In this embodiment of the present application, the first electronic device may create a 3D map in advance, and transmit the 3D map to the server after performing compression processing, so that the transmission bandwidth may be reduced.
1. The first electronic device creates a 3D map and compresses the 3D map to obtain compressed data of the 3D map.
2. The first electronic device sends compressed data of the 3D map to the server.
The first electronic equipment compresses the 3D map and transmits the 3D map, so that the transmission bandwidth can be reduced, and the transmission efficiency can be improved.
3. The second electronic device sends a map download request to the server.
The second electronic device may also send a map download request based on the triggering scheme shown in fig. 4 a.
4. The server transmits the compressed data of the 3D map to the second electronic device.
5. And the second electronic equipment decompresses the compressed data of the 3D map to obtain the reconstructed data of the 3D map.
6. The second electronic device acquires visual information through the sensor.
7. And the second electronic equipment is positioned in the 3D map according to the visual information so as to obtain the pose corresponding to the visual information.
Fig. 4e is a schematic diagram of an application scenario provided in the embodiment of the present application. As shown in fig. 4e, the application scenario is that the second electronic device collects visual information through a sensor, and the server determines the current pose of the second electronic device by combining the visual information from the second electronic device and the 3D map from the first electronic device.
The 3D map is created by the first electronic device, that is, the first electronic device creates a 3D map, performs compression processing on the 3D map, and then transmits the compressed data of the 3D map to the server. And the server performs decompression processing to obtain reconstruction data of the 3D map, and determines the current pose of the second electronic device by combining the visual information from the second electronic device and the 3D map.
1. The first electronic device creates a 3D map and compresses the 3D map to obtain compressed data of the 3D map.
2. The first electronic device sends compressed data of the 3D map to the server.
3. The second electronic device acquires visual information through the sensor.
4. The second electronic device sends a positioning request to the server, wherein the positioning request carries visual information.
5. The server decompresses the compressed data of the 3D map to obtain reconstructed data of the 3D map.
6. And the server performs positioning in the 3D map according to the visual information to obtain the pose corresponding to the visual information.
7. And the server sends the pose obtained by positioning to the second electronic equipment.
Fig. 4f is a schematic diagram of an application scenario provided in the embodiment of the present application. As shown in fig. 4f, the application scenario is that a second electronic device collects visual information through a sensor, and determines the current pose of the second electronic device by combining the visual information and a 3D map from the first electronic device.
The 3D map is created by the first electronic device, namely the first electronic device creates a 3D map, the 3D map is compressed, compressed data of the 3D map is sent to the second electronic device, the second electronic device performs decompression processing to obtain reconstructed data of the 3D map, and the current pose of the second electronic device is determined by combining the acquired visual information and the 3D map from the first electronic device.
1. The first electronic device creates a 3D map, compresses the 3D map, and stores compressed data of the 3D map locally.
2. The second electronic device sends a map downloading request to the first electronic device.
3. The first electronic device sends compressed data of the 3D map to the second electronic device.
4. And the second electronic equipment decompresses the compressed data of the 3D map to obtain the reconstructed data of the 3D map.
5. The second electronic device acquires visual information through the sensor.
6. And the second electronic equipment is positioned in the 3D map according to the visual information so as to obtain the pose corresponding to the visual information.
In the embodiment shown in fig. 4 a-4 f, the positioning algorithm used may include:
(1) Extracting the region descriptor to be searched from the visual information, where the algorithm used to extract the region descriptor to be searched is consistent with the algorithm used to extract the region descriptors of the 3D map.
(2) Extracting the 3D map points to be searched from the visual information, and acquiring the spatial positions and the 3D map point descriptors of the 3D map points to be searched, where the algorithm for extracting the 3D map point descriptors to be searched is consistent with the algorithm for extracting the 3D map point descriptors of the 3D map.
(3) Searching among a plurality of region descriptors contained in the data of the 3D map according to the region descriptor to be searched, to obtain a plurality of candidate region descriptors.
In this embodiment, the distance between the region descriptor to be searched and each of the plurality of region descriptors may be calculated, where the distance may include a Hamming distance, a Manhattan distance, or a Euclidean distance; then at least one region descriptor that meets a condition (for example, the distance is smaller than a threshold value) is selected as the candidate region descriptors. A sketch of this candidate search, together with a pose-solving example, is given after this list.
(4) Matching the 3D map point descriptors to be searched against the 3D map point descriptors corresponding to each of the candidate region descriptors, i.e., calculating the similarity between the 3D map point descriptors to be searched and the 3D map point descriptors corresponding to each candidate region descriptor, to find the most similar 3D map points.
(5) According to the matched 3D map points, calculating the pose of the electronic device using a pose solving algorithm such as perspective-n-point camera pose estimation (PnP) or efficient perspective-n-point camera pose estimation (EPnP).
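As an illustration of step (3), the following Python sketch selects candidate region descriptors by Euclidean distance and a threshold. The function name find_candidate_regions, the descriptor dimension, and the threshold value are illustrative assumptions, not part of this application.

```python
import numpy as np

def find_candidate_regions(query_desc, region_descs, threshold):
    """Keep region descriptors whose Euclidean distance to the query
    descriptor is below a threshold (Hamming or Manhattan distance
    could be used instead, as noted above)."""
    dists = np.linalg.norm(np.asarray(region_descs) - np.asarray(query_desc), axis=1)
    return np.where(dists < threshold)[0]

# Toy data: 4 region descriptors of dimension 8.
rng = np.random.default_rng(0)
regions = rng.normal(size=(4, 8))
query = regions[2] + 0.01 * rng.normal(size=8)
print(find_candidate_regions(query, regions, threshold=0.5))  # likely [2]
```

For step (5), a pose can be solved with an off-the-shelf PnP solver. The sketch below uses OpenCV's solvePnP with the EPnP flag purely as one possible solver; all point coordinates and camera intrinsics are fabricated placeholders for illustration.

```python
import numpy as np
import cv2  # OpenCV's solvePnP, used here only as one possible solver

# Matched 3D map points (step (4)) and their 2D projections: fabricated values.
object_points = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
                          [0, 0, 1], [1, 1, 0], [1, 0, 1]], dtype=np.float64)
image_points = np.array([[320, 240], [400, 238], [318, 160],
                         [322, 300], [398, 158], [402, 302]], dtype=np.float64)
camera_matrix = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(4)  # assume an undistorted camera

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix,
                              dist_coeffs, flags=cv2.SOLVEPNP_EPNP)
print(ok, rvec.ravel(), tvec.ravel())  # rotation (Rodrigues vector) and translation
```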
In the embodiments shown in fig. 4a to 4f, compression processing is performed on the 3D map. The embodiments of the present application provide a plurality of apparatus frameworks for performing the compression processing, which are described below.
In any application scenario from fig. 4a to fig. 4f, positioning is performed based on the 3D map according to the embodiment of the present application to obtain the current pose of the electronic device, where the pose may be applied to the fields of AR navigation, AR man-machine interaction, assisted driving, automatic driving, and the like. Taking AR navigation based on this pose as an example, fig. 4g is a schematic diagram of a user interface displayed by the electronic device according to the embodiment of the present application, where the electronic device may display, based on the pose, the user interface shown in fig. 4g, and the user interface may include a navigation arrow indication for navigating to the conference room 2, where the navigation arrow indication for navigating to the conference room 2 may be a virtual object acquired from a server or acquired locally based on the pose. Visual information collected by the sensor may also be included on the user interface, for example, a building as shown in fig. 4 g. The user goes to the conference room 2 with reference to the user interface of the electronic device as shown in fig. 4 g.
Fig. 5 is a flowchart of a process 500 of the 3D map encoding method according to an embodiment of the present application. As shown in fig. 5, the process 500 may be performed by an encoding apparatus, and the encoding apparatus may be applied to a server or an electronic device in the above embodiments, particularly a device that needs to compress and send a 3D map, for example, the server in the embodiment shown in fig. 4a, or the first electronic device in the embodiments shown in fig. 4d to 4f. Process 500 is described as a series of steps or operations; it should be understood that process 500 may be performed in various orders and/or concurrently, and is not limited to the order of execution depicted in fig. 5. Assuming that the encoding apparatus compresses the data of a 3D map to obtain a code stream and transmits the code stream, process 500 includes the following steps.
Step 501, acquiring a plurality of 3D map points included in the 3D map, where the plurality of 3D map points have cross-modal correlation, which means that the order of any one 3D map point among the plurality of 3D map points is associated with a plurality of attributes of that 3D map point.
The plurality of attributes of the 3D map point comprise at least a 3D map point descriptor, which as described in the previous embodiments is a vector comprising at least one component.
Optionally, on the basis that the plurality of attributes includes a 3D map point descriptor, the plurality of attributes of the 3D map point may further include a 3D map point spatial location. As described in the foregoing embodiment, the spatial position of the 3D map point may be represented by X, Y, Z on a three-dimensional spatial axis, or may be represented by longitude and latitude, altitude, or may be represented by polar coordinates. The plurality of attributes may further include identification, importance, acquisition time, acquisition equipment, additional text description, and the like of the 3D map points, and the embodiment of the present application does not limit the content and the number of the plurality of attributes.
In the embodiment of the present application, when a certain attribute of any one 3D map point among the plurality of 3D map points belongs to the plurality of attributes, the order of that 3D map point among the plurality of 3D map points is associated with that attribute.
Step 502, sorting the plurality of 3D map points according to the plurality of attributes of the plurality of 3D map points.
As in the previous embodiments, the plurality of 3D map points may be ordered according to the 3D map point descriptors of the plurality of 3D map points. The plurality of 3D map points may also be ordered according to 3D map point descriptors of the plurality of 3D map points and the 3D map point spatial locations.
In this embodiment of the present application, the plurality of 3D map points may be ordered based on the similarity of the plurality of attributes between the plurality of 3D map points, and it may be generally considered that the more similar the plurality of attributes, the higher the similarity between two 3D map points. Alternatively, the distance between any two 3D map points may be obtained according to a plurality of attributes between any two 3D map points in the plurality of 3D map points, and the distance may be used to represent the similarity of the plurality of attributes between the any two 3D map points, where a closer distance may represent a higher similarity or a lower similarity of the plurality of attributes, and the relationship between the distance and the similarity of the plurality of attributes is not limited in this embodiment of the present application.
For example, the distance corresponding to each attribute may be obtained according to each attribute of the plurality of attributes between any two 3D map points, and then the distance between any two 3D map points may be obtained by using the distance corresponding to each attribute of the plurality of attributes. The distance may include euclidean distance, hamming distance, manhattan distance, etc., and the specific distance acquisition method in the embodiment of the present application is not limited.
Take as an example a plurality of attributes including a 3D map point descriptor and a 3D map point spatial position, where the 3D map point descriptor is a vector and the 3D map point spatial position is represented by X, Y, Z on three-dimensional spatial axes. Let the 3D map point descriptors of any two 3D map points be a1 and a2, each of which may include at least one component, and let their 3D map point spatial positions be b1(x1, y1, z1) and b2(x2, y2, z2). The distance corresponding to the 3D map point descriptors of the two 3D map points may be, for example, the Euclidean distance d1 = ||a1 - a2||, and the distance corresponding to the 3D map point spatial positions may be d2 = sqrt((x1 - x2)² + (y1 - y2)² + (z1 - z2)²). The distance d between the two 3D map points can then be derived using d1 and d2, e.g., d = c1·d1 + c2·d2. The embodiment of the present application does not limit the values of c1 and c2; for example, both c1 and c2 may be 1. The above plurality of attributes and the manner of calculating the distance between any two 3D map points according to the plurality of attributes are merely illustrative, and the embodiments of the present application do not limit the plurality of attributes or the manner of calculating the distance between two 3D map points.
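The combined distance above can be written as a small helper. This is a minimal sketch assuming Euclidean distances for both components; the function name map_point_distance is illustrative, and the weights c1 and c2 default to 1 as in the example.

```python
import numpy as np

def map_point_distance(a1, b1, a2, b2, c1=1.0, c2=1.0):
    """d1: Euclidean distance between descriptors a1 and a2;
    d2: Euclidean distance between spatial positions b1 and b2;
    combined distance d = c1*d1 + c2*d2 (the weights are not fixed
    by the text; both default to 1)."""
    d1 = np.linalg.norm(np.asarray(a1, float) - np.asarray(a2, float))
    d2 = np.linalg.norm(np.asarray(b1, float) - np.asarray(b2, float))
    return c1 * d1 + c2 * d2

print(map_point_distance([0.1, 0.9], [0, 0, 0], [0.2, 0.7], [1, 1, 1]))
```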
This step 502 will be described below by taking an example in which the closer the distance is, the higher the similarity of the plurality of attributes is.
In one possible implementation, the plurality of 3D map points are arranged in a sequence, each element in the sequence corresponding to one 3D map point of the plurality of 3D map points. In the sequence, a plurality of attributes between two adjacent 3D map points are associated. Wherein the sequence may comprise: list, array, character string, etc., the embodiment of the present application does not limit the expression form of the sequence.
In the embodiment of the present application, the 3D map point that currently needs to be added to the sequence can be found, among the at least one 3D map point not yet added to the sequence, based on the 3D map point most recently added to the sequence (i.e., the 3D map point corresponding to the element last added to the sequence). Optionally, the distances between the at least one 3D map point not yet added to the sequence and the most recently added 3D map point may be determined according to the plurality of attributes of the at least one 3D map point not yet added and the plurality of attributes of the most recently added 3D map point. The 3D map point, among those not yet added, that is closest to the most recently added 3D map point is then added to the sequence and arranged after it. The plurality of 3D map points contained in the 3D map consists of the at least one 3D map point not yet added to the sequence and the at least one 3D map point already added to the sequence. In the sequence obtained by this process, the plurality of attributes of any two adjacent 3D map points are therefore associated. Since two 3D map points whose attributes are more similar can generally be considered more similar, this ordering at least ensures a high similarity between adjacent 3D map points, and thus improves the overall similarity among the plurality of 3D map points in the sequence.
By way of example, assume that the number of 3D map points in the 3D map is n, where n is a positive integer greater than 2. First, any 3D map point is determined as the starting 3D map point and added to the sequence. Then, according to the plurality of attributes of the n-1 3D map points not yet added to the sequence and the plurality of attributes of the starting 3D map point, the distances between those n-1 3D map points and the starting 3D map point are determined; the 3D map point closest to the starting 3D map point is determined as the 2nd 3D map point, added to the sequence, and arranged after the starting 3D map point. Next, according to the plurality of attributes of the n-2 3D map points not yet added and the plurality of attributes of the 2nd 3D map point, the distances between those n-2 3D map points and the 2nd 3D map point are determined; the 3D map point closest to the 2nd 3D map point is determined as the 3rd 3D map point, added to the sequence, and arranged after the 2nd 3D map point. And so on, until the nth 3D map point is added to the sequence and arranged after the (n-1)th 3D map point, yielding n 3D map points arranged as a sequence.
Taking n = 6 as an example, the plurality of 3D map points in the 3D map includes 6 3D map points P1 to P6. First, P3 is determined as the starting 3D map point and added to the sequence. According to the attributes of the 5 3D map points not yet added to the sequence and the attributes of P3, the distances between those 5 3D map points and P3 are determined; the closest 3D map point, P5, is added to the sequence and arranged after P3. According to the attributes of the 4 3D map points not yet added and the attributes of P5, the distances between those 4 3D map points and P5 are determined; the closest 3D map point, P2, is added to the sequence and arranged after P5. According to the attributes of the 3 3D map points not yet added and the attributes of P2, the distances between those 3 3D map points and P2 are determined; the closest 3D map point, P4, is added to the sequence and arranged after P2. According to the attributes of the 2 3D map points not yet added and the attributes of P4, the distances between those 2 3D map points and P4 are determined; the closest 3D map point, P6, is added to the sequence and arranged after P4. At this point, only 1 3D map point, P1, has not been added to the sequence; P1 can be directly added to the sequence and arranged after P6, yielding 6 3D map points arranged as a sequence: P3→P5→P2→P4→P6→P1.
Fig. 6 is a schematic diagram of a sorting result provided in the embodiment of the present application; fig. 6 shows the sorting result of the aforementioned 6 3D map points P1 to P6, where the 6 3D map points are arranged as a sequence. As shown in fig. 6, the original order of the 6 points is P1→P2→P3→P4→P5→P6, and the sequence of the 6 points after sorting is P3→P5→P2→P4→P6→P1.
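A greedy ordering of this kind can be sketched as follows. This is an illustrative reading of the procedure above, not a normative implementation; order_points_greedy and the toy 1D attributes are assumptions, and a real distance would combine the descriptor and spatial-position distances as described earlier.

```python
def order_points_greedy(points, dist, start):
    """Arrange map points as a sequence: repeatedly append the
    not-yet-added point closest to the most recently added one."""
    remaining = set(range(len(points))) - {start}
    order = [start]
    while remaining:
        last = order[-1]
        nearest = min(remaining, key=lambda j: dist(points[last], points[j]))
        order.append(nearest)
        remaining.remove(nearest)
    return order

# Toy 1D "attributes" standing in for (descriptor, position) pairs.
pts = [0.0, 5.0, 1.0, 4.0, 2.0, 3.0]
print(order_points_greedy(pts, dist=lambda p, q: abs(p - q), start=2))
```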
In one possible implementation, the plurality of 3D map points are arranged as a topology tree, each node in the topology tree corresponding to one 3D map point of the plurality of 3D map points. In the topology tree, the plurality of attributes of a 3D map point located at a parent node position and of a 3D map point located at one of its child node positions are associated. The topology tree may include a B-tree, a B+ tree, a B* tree, a hash tree, etc.; the embodiment of the present application does not limit the form of the topology tree.
In the embodiment of the present application, the 3D map point that currently needs to be added to the topology tree can be found, among the at least one 3D map point not yet added to the topology tree, based on some or all of the 3D map points already added to the topology tree. Optionally, the distances between the at least one 3D map point not yet added to the topology tree and the at least one 3D map point already added may be determined according to the plurality of attributes of the plurality of 3D map points. The not-yet-added 3D map point that is closest to some 3D map point already in the topology tree is then added to the tree and arranged at a child node position of that closest 3D map point; that is, the closest 3D map point already in the tree becomes the parent node of the newly added 3D map point. The plurality of 3D map points contained in the 3D map consists of the at least one 3D map point not yet added to the topology tree and the at least one 3D map point already added. In the topology tree obtained by this process, the plurality of attributes of a 3D map point at a parent node position and of a 3D map point at one of its child node positions are therefore associated. Since two 3D map points whose attributes are more similar can generally be considered more similar, this arrangement at least ensures a high similarity between a parent-node 3D map point and its child-node 3D map points, and thus improves the overall similarity among the plurality of 3D map points in the topology tree.
By way of example, assume that the number of 3D map points in the 3D map is n. First, any 3D map point is determined as the starting 3D map point and added to the topology tree. According to the plurality of attributes of the 3D map points, the distances between the n-1 3D map points not yet added to the topology tree and the starting 3D map point are determined; the 3D map point closest to the starting 3D map point is determined as the 2nd 3D map point, added to the topology tree, and arranged at a child node position of the starting 3D map point. Next, the distances between the n-2 3D map points not yet added and the starting 3D map point, and between them and the 2nd 3D map point, are determined, giving 2(n-2) distances; the not-yet-added 3D map point corresponding to the smallest of these 2(n-2) distances is determined as the 3rd 3D map point, added to the topology tree, and arranged at a child node position of the already-added 3D map point to which it is closest. And so on, until the distances between the nth 3D map point and the n-1 3D map points already in the topology tree are determined, giving n-1 distances, and the nth 3D map point is added to the topology tree and arranged at a child node position of the already-added 3D map point to which it is closest, yielding n 3D map points arranged as a topology tree.
Taking n = 6 as an example, the plurality of 3D map points in the 3D map includes 6 3D map points S1 to S6. First, S3 is determined as the starting point and added to the topology tree. According to the plurality of attributes of the 6 3D map points, the distances between the 5 3D map points not yet added to the topology tree and S3 are determined, giving 5 distances; among them the distance between S5 and S3 is the smallest, so S5 is added to the topology tree and arranged at a child node position of S3. Then the distances between the 4 3D map points not yet added and S3, and between them and S5, are determined, giving 8 distances; among them the distance between S2 and S3 is the smallest, so S2 is added to the topology tree and arranged at a child node position of S3. Then the distances between the 3 3D map points not yet added and S3, S5, and S2, respectively, are determined, giving 9 distances; among them the distance between S1 and S2 is the smallest, so S1 is added to the topology tree and arranged at a child node position of S2. Then the distances between the 2 3D map points not yet added and S3, S5, S2, and S1, respectively, are determined, giving 8 distances; among them the distance between S6 and S1 is the smallest, so S6 is added to the topology tree and arranged at a child node position of S1. Finally, the distances between the 1 remaining 3D map point S4 and S3, S5, S2, S1, and S6, respectively, are determined, giving 5 distances; among them the distance between S4 and S6 is the smallest, so S4 is added to the topology tree and arranged at a child node position of S6.
Fig. 7 is a schematic diagram of a sorting result provided in the embodiment of the present application, and fig. 7 shows the sorting result of the foregoing 6 3D map points S1 to S6, where the 6 3D map points are arranged in a topology tree manner. As shown in fig. 7, S3 is located at the parent node position of S5 and S2, S2 is located at the parent node position of S1, S1 is located at the parent node position of S6, and S6 is located at the parent node position of S4.
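The topology-tree arrangement can be sketched in the same spirit. Again this is only an illustrative reading; build_topology_tree and the toy attributes are assumptions.

```python
def build_topology_tree(points, dist, start):
    """Arrange map points as a topology tree: each new point is attached
    as a child of the closest point already in the tree. Returns a dict
    mapping each index to its parent index (the start point maps to None)."""
    parent = {start: None}
    remaining = set(range(len(points))) - {start}
    while remaining:
        # Among all (not-yet-added, already-added) pairs, pick the closest pair.
        child, par = min(((j, p) for j in remaining for p in parent),
                         key=lambda jp: dist(points[jp[0]], points[jp[1]]))
        parent[child] = par
        remaining.remove(child)
    return parent

pts = [0.0, 5.0, 1.0, 4.0, 2.0, 3.0]
print(build_topology_tree(pts, dist=lambda p, q: abs(p - q), start=2))
```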
It should be noted that, when none of the plurality of 3D map points is added to the sequence or the topology tree, it is necessary to determine a starting point and add the starting point to the sequence or the topology tree. Alternatively, the starting point may be determined based on a plurality of attributes of a plurality of 3D map points, or one 3D map point randomly selected from a plurality of 3D map points may be determined as the starting point, and the selection of the starting point is not specifically limited in the embodiments of the present application. By way of example, taking an example in which the plurality of attributes include 3D map point spatial positions, a distance between each 3D map point and the reference position may be determined according to the 3D map point spatial position of each 3D map point among the plurality of 3D map points, and then the 3D map point closest to the reference position may be determined as the start point. The reference location may be a geometric center location of a plurality of 3D map points. Alternatively, any one of the 3D map points located on the edge position among the plurality of 3D map points may be determined as the start point according to the 3D map point space position of each of the plurality of 3D map points.
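One of the starting-point options mentioned above, choosing the 3D map point closest to the geometric center, might look like the following sketch; pick_start_point and the sample positions are illustrative.

```python
import numpy as np

def pick_start_point(positions):
    """Return the index of the 3D map point whose spatial position is
    closest to the geometric center of all points (one option above)."""
    positions = np.asarray(positions, dtype=float)
    center = positions.mean(axis=0)
    return int(np.argmin(np.linalg.norm(positions - center, axis=1)))

print(pick_start_point([[0, 0, 0], [10, 0, 0], [0, 10, 0], [5, 5, 1]]))
```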
It should be noted that, in the foregoing embodiments, the 3D map points, the number of 3D map points, the sorting result, the sorting manner, and the like are merely exemplary and are not limited thereto.
Step 503, determining at least one reference 3D map point of the current 3D map point.
The plurality of 3D map points includes a current 3D map point and at least one reference 3D map point, and each of the at least one reference 3D map point is a 3D map point that has been encoded before the current 3D map point is encoded. The at least one reference 3D map point is used for predicting the current 3D map point, and the higher the similarity between the at least one reference 3D map point and the plurality of attributes of the current 3D map point is, the higher the similarity between the at least one reference 3D map point and the current 3D map point is, the better the prediction effect is, and the smaller the data volume of residual data obtained through subsequent prediction is.
In the embodiment of the present application, when the number of reference 3D map points is one, the higher the similarity between at least one reference 3D map point and a plurality of attributes of the current 3D map point, the smaller the data amount of residual data obtained by predicting the current 3D map point. When the number of the reference 3D map points is plural, the plural reference 3D map points in combination can realize more accurate prediction of the current 3D map point, thereby further reducing the data amount of the predicted residual data.
Several implementations are shown in the foregoing step 502, and the manner in which at least one reference 3D map point of the current 3D map point is determined is different for different implementations.
Corresponding to step 502, when a plurality of 3D map points are arranged in a sequential manner, in a first example, the number of reference 3D map points is one. Alternatively, the first 3D map point may be determined as at least one reference 3D map point (i.e., one reference 3D map point), the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point. Taking the sequence shown in fig. 6 as an example, assuming that the current 3D map point is P2, the first 3D map point is P5. A first 3D map point of the plurality of 3D map points arranged in a sequential manner is associated with a plurality of attributes of the current 3D map point, i.e., the first 3D map point has a higher similarity to the current 3D map point. And determining the first 3D map point as a reference 3D map point, and then predicting the current 3D map point according to the first 3D map point, so that the prediction effect can be improved, and the data volume of residual data obtained by prediction can be effectively reduced.
In a second example, the number of reference 3D map points is plural. Optionally, both a first 3D map point and a second 3D map point may be determined as the at least one reference 3D map point (i.e., two reference 3D map points), the first 3D map point being the 3D map point arranged in the sequence immediately before the current 3D map point, and the second 3D map point being the 3D map point arranged in the sequence immediately before the first 3D map point. Still taking the sequence shown in fig. 6 as an example, the first 3D map point is P5, and the second 3D map point is P3. Among the plurality of 3D map points arranged as a sequence, the plurality of attributes of the first 3D map point are associated with those of the current 3D map point, and the plurality of attributes of the second 3D map point are associated with those of the first 3D map point. That is, the first 3D map point has a high similarity to the current 3D map point, the second 3D map point has a high similarity to the first 3D map point, and thus the second 3D map point also has a high similarity to the current 3D map point. The first 3D map point and the second 3D map point are determined as reference 3D map points, and the current 3D map point is subsequently predicted by combining the first 3D map point and the second 3D map point, which can improve the accuracy of prediction and further reduce the data amount of the residual data obtained by prediction.
The above selection of the reference 3D map points is merely exemplary. Optionally, all of the n 3D map points immediately preceding the current 3D map point in the sequence may be determined as reference 3D map points, where n > 2 (e.g., n may be 3, 4, 5, etc.). Alternatively, at least one 3D map point that is arranged before the current 3D map point in the sequence and whose distance from the current 3D map point is smaller than a distance threshold may be determined as a reference 3D map point; the number of reference 3D map points and the manner of determining them are not limited in the embodiment of the present application. For example, when both the first 3D map point and the second 3D map point are determined as reference 3D map points, a 3D map point arranged in the sequence before the second 3D map point may also be determined as a reference 3D map point.
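The sequence-based reference selection policies above (the n immediately preceding points, optionally filtered by a distance threshold) can be sketched as follows; select_references is an illustrative name, not from this application.

```python
def select_references(order, current_pos, n=2, dist_to_current=None, threshold=None):
    """Reference candidates for the point at position current_pos in the
    encoding sequence: its n immediate predecessors, optionally filtered
    by a distance threshold (both policies appear in the text above)."""
    refs = order[max(0, current_pos - n):current_pos]
    if dist_to_current is not None and threshold is not None:
        refs = [r for r in refs if dist_to_current(r) < threshold]
    return refs

order = ["P3", "P5", "P2", "P4", "P6", "P1"]  # the sequence from fig. 6
print(select_references(order, order.index("P2"), n=2))  # ['P3', 'P5']
```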
Corresponding to step 502, when a plurality of 3D map points are arranged in a topology tree, in a first example, the number of reference 3D map points is one. Alternatively, a third 3D map point may be determined as at least one reference 3D map point (i.e., one reference 3D map point), the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree. Taking the topology tree shown in fig. 7 as an example, assuming that the current 3D map point is S6, the third 3D map point is S1. A third 3D map point of the plurality of 3D map points arranged in a topology tree manner is associated with a plurality of attributes of the current 3D map point, that is, the third 3D map point has a higher similarity with the current 3D map point. And determining the third 3D map point as a reference 3D map point, and then predicting the current 3D map point according to the third 3D map point, so that the prediction effect can be improved, and the data volume of residual data obtained by prediction can be effectively reduced.
In a second example, the number of reference 3D map points is plural. Optionally, a third 3D map point and a fourth 3D map point may each be determined as the at least one reference 3D map point, the third 3D map point being the 3D map point located at the parent node position of the current 3D map point in the topology tree, and the fourth 3D map point being the 3D map point located at the parent node position of the third 3D map point in the topology tree. Still taking the topology tree shown in fig. 7 as an example, the third 3D map point is S1, and the fourth 3D map point is S2. Among the plurality of 3D map points arranged as a topology tree, the plurality of attributes of the third 3D map point are associated with those of the current 3D map point, and the plurality of attributes of the fourth 3D map point are associated with those of the third 3D map point. That is, the third 3D map point has a high similarity to the current 3D map point, the fourth 3D map point has a high similarity to the third 3D map point, and thus the fourth 3D map point also has a high similarity to the current 3D map point. The third 3D map point and the fourth 3D map point are determined as reference 3D map points, and the current 3D map point is subsequently predicted by combining the third 3D map point and the fourth 3D map point, which can improve the accuracy of prediction and further reduce the data amount of the residual data obtained by prediction.
The above selection of the reference 3D map points is merely exemplary. Optionally, the third 3D map point, the fourth 3D map point, and the 3D map point located at the parent node position of the fourth 3D map point may all be determined as reference 3D map points. Alternatively, among the third 3D map point, the fourth 3D map point, and the 3D map point located at the parent node position of the fourth 3D map point, those whose distance from the current 3D map point is smaller than a distance threshold may be determined as reference 3D map points; the number of reference 3D map points and the manner of determining them are not limited in the embodiment of the present application.
And step 504, predicting the data of the current 3D map point according to the data of at least one reference 3D map point to obtain residual data of the current 3D map point.
The data of any 3D map point of the plurality of 3D map points includes at least a plurality of attributes. When the plurality of attributes of the 3D map point include at least a 3D map point descriptor, the data includes at least a 3D map point descriptor. When the plurality of attributes of the 3D map point include at least a 3D map point descriptor and a 3D map point spatial location, the data includes at least a 3D map point descriptor and a 3D map point spatial location. The data may further include content other than a plurality of attributes, and the embodiments of the present application do not limit the content and the amount included in the data.
In one possible implementation, a predicted value of the current 3D map point may be obtained from the data of the at least one reference 3D map point, and the residual data of the current 3D map point is then obtained from the data of the current 3D map point and the obtained predicted value. For example, the predicted value may be subtracted from the data of the current 3D map point to obtain the residual data of the current 3D map point. The predicted value may be obtained in at least one of the following ways: taking an extremum, taking an average, or computing a weighted sum of the data of the at least one reference 3D map point, etc.; the manner of obtaining the predicted value is not limited in the embodiment of the present application. Taking the case where the data includes a 3D map point descriptor and a 3D map point spatial position as an example, a predicted value of the 3D map point descriptor of the current 3D map point can first be obtained based on the 3D map point descriptor of the at least one reference 3D map point, and the predicted value is then subtracted from the 3D map point descriptor of the current 3D map point to obtain the residual data of the 3D map point descriptor of the current 3D map point. The residual data of the 3D map point spatial position of the current 3D map point is obtained similarly: the predicted value is obtained first, and the predicted value is then subtracted from the actual value to obtain the residual value.
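A minimal sketch of this prediction-plus-residual step, using averaging (one of the options listed above) as the predictor; predict_and_residual and the sample values are illustrative assumptions.

```python
import numpy as np

def predict_and_residual(current, references, weights=None):
    """Predict the current point's data as a (weighted) average of the
    reference points' data, then subtract the prediction from the actual
    data to obtain the residual."""
    refs = np.asarray(references, dtype=float)
    predicted = refs.mean(axis=0) if weights is None else np.average(refs, axis=0, weights=weights)
    return np.asarray(current, dtype=float) - predicted

# Descriptor residual with two reference points; a decoder reverses
# this step: data = predicted + residual.
print(predict_and_residual([1.0, 2.0, 3.0], [[0.9, 2.1, 2.8], [1.1, 1.9, 3.1]]))
```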
In one possible implementation, the residual data of the current 3D map point may be obtained directly from the data of the current 3D map point and the data of the at least one reference 3D map point. For example, the data of the at least one reference 3D map point may be subtracted from the data of the current 3D map point to obtain the residual data of the current 3D map point. Taking the case where the data includes a 3D map point descriptor and a 3D map point spatial position as an example, the 3D map point descriptor of the at least one reference 3D map point may be subtracted from the 3D map point descriptor of the current 3D map point to obtain the residual data of the 3D map point descriptor of the current 3D map point. The residual data of the 3D map point spatial position of the current 3D map point is obtained similarly: the reference value is subtracted from the actual value to obtain the residual value.
In this embodiment of the present application, according to some or all elements included in the data of at least one reference 3D map point, the corresponding elements of the current 3D map point may be respectively predicted, so as to obtain residual data of each element in some or all elements. The residual data of the current 3D map point includes residual data of each of some or all of the elements.
For example, assuming that the data includes a 3D map point descriptor and a 3D map point spatial location, the 3D map point descriptor of the current 3D map point may be predicted from the 3D map point descriptor of the at least one reference 3D map point to obtain residual data of the 3D map point descriptor of the current 3D map point, and the 3D map point spatial location of the current 3D map point may be predicted from the 3D map point spatial location of the at least one reference 3D map point to obtain residual data of the 3D map point spatial location of the current 3D map point. The residual data of the current 3D map point then includes the residual data of the 3D map point descriptor and the residual data of the 3D map point spatial location.
Step 504 is described below using an example in which the data includes a 3D map point descriptor and a 3D map point spatial location, and the residual data of the current 3D map point is obtained by subtracting the data of the at least one reference 3D map point from the data of the current 3D map point.
As described in the foregoing embodiments, the 3D map point descriptor may be a vector, and the 3D map point spatial location may be represented by X, Y, Z coordinates on three-dimensional spatial axes, by longitude, latitude, and altitude, by polar coordinates, or the like. Taking as an example that the 3D map point spatial location is represented by X, Y, Z coordinates on three-dimensional spatial axes, residual data of the 3D map point descriptor of the current 3D map point can be obtained from the 3D map point descriptor of the at least one reference 3D map point, the 3D map point descriptor of the current 3D map point, and a first residual formula; and residual data of the 3D map point spatial location of the current 3D map point can be obtained from the 3D map point spatial location of the at least one reference 3D map point, the 3D map point spatial location of the current 3D map point, and a second residual formula.
When the number of reference 3D map points differs, the first residual formula and the second residual formula differ as well. For example, when there is one reference 3D map point (e.g., the first 3D map point or the third 3D map point in the foregoing step 503), the first residual formula may include:

r_ai = m1 · a_i - m2 · a_i′

where r_ai represents the residual data of the 3D map point descriptor of the current 3D map point, a_i represents the 3D map point descriptor of the current 3D map point, and a_i′ represents the 3D map point descriptor of the reference 3D map point. Referring to the foregoing step 503, a_i′ may represent the 3D map point descriptor of the first 3D map point or of the third 3D map point. The embodiments of this application do not limit the values of m1 and m2; for example, m1 or m2 may be 1.
The second residual formula may include:

(r_xi, r_yi, r_zi) = m3 · (x_i, y_i, z_i) - m4 · (x_i′, y_i′, z_i′)

where (r_xi, r_yi, r_zi) represents the residual data of the 3D map point spatial location of the current 3D map point, (x_i, y_i, z_i) represents the 3D map point spatial location of the current 3D map point, and (x_i′, y_i′, z_i′) represents the 3D map point spatial location of the reference 3D map point. Referring to the foregoing step 503, (x_i′, y_i′, z_i′) may represent the 3D map point spatial location of the first 3D map point or of the third 3D map point. The embodiments of this application do not limit the values of m3 and m4; for example, m3 or m4 may be 1.
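For illustration only, the single-reference formulas above translate directly into code; the helper names are invented for the sketch, and the defaults m1 = m2 = m3 = m4 = 1 are the example values mentioned above:

```python
def first_residual(a_i, a_ref, m1=1.0, m2=1.0):
    # r_ai = m1 · a_i - m2 · a_i′  (a_i: current descriptor, a_ref: reference)
    return m1 * a_i - m2 * a_ref

def second_residual(p_i, p_ref, m3=1.0, m4=1.0):
    # (r_xi, r_yi, r_zi) = m3 · (x_i, y_i, z_i) - m4 · (x_i′, y_i′, z_i′)
    return tuple(m3 * c - m4 * c_ref for c, c_ref in zip(p_i, p_ref))

# Nearby points give small residuals:
print(second_residual((10.0, 5.0, 2.0), (9.5, 5.2, 1.9)))  # about (0.5, -0.2, 0.1)
```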
When there are a plurality of reference 3D map points, taking two reference 3D map points as an example (e.g., the first 3D map point and the second 3D map point, or the third 3D map point and the fourth 3D map point in the foregoing step 503), the first residual formula may include:

r_ai = n1 · a_i - n2 · a_i′ - n3 · a_i″

where r_ai represents the residual data of the 3D map point descriptor of the current 3D map point, a_i represents the 3D map point descriptor of the current 3D map point, a_i′ represents the 3D map point descriptor of one of the two reference 3D map points (e.g., the one closer to the current 3D map point), and a_i″ represents the 3D map point descriptor of the other reference 3D map point (e.g., the one farther from the current 3D map point). Referring to the foregoing step 503, a_i′ may represent the 3D map point descriptor of the first 3D map point and a_i″ that of the second 3D map point; or a_i′ may represent the 3D map point descriptor of the third 3D map point and a_i″ that of the fourth 3D map point. The embodiments of this application do not limit the values of n1, n2, and n3; for example, n1 and n2 may be 1 and n3 may be -1.
The second residual formula may include:

(r_xi, r_yi, r_zi) = n4 · (x_i, y_i, z_i) - n5 · (x_i′, y_i′, z_i′) - n6 · (x_i″, y_i″, z_i″)

where (r_xi, r_yi, r_zi) represents the residual data of the 3D map point spatial location of the current 3D map point, (x_i, y_i, z_i) represents the 3D map point spatial location of the current 3D map point, (x_i′, y_i′, z_i′) represents the 3D map point spatial location of one of the two reference 3D map points (e.g., the one closer to the current 3D map point), and (x_i″, y_i″, z_i″) represents the 3D map point spatial location of the other reference 3D map point (e.g., the one farther from the current 3D map point). Referring to the foregoing step 503, (x_i′, y_i′, z_i′) may represent the 3D map point spatial location of the first 3D map point and (x_i″, y_i″, z_i″) that of the second 3D map point; or (x_i′, y_i′, z_i′) may represent the 3D map point spatial location of the third 3D map point and (x_i″, y_i″, z_i″) that of the fourth 3D map point. The embodiments of this application do not limit the values of n4, n5, and n6; for example, n4 and n5 may be 1 and n6 may be -1.
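Again as an illustration only, the two-reference formulas can be sketched the same way; the defaults follow the example values above (n1 = n2 = n4 = n5 = 1, n3 = n6 = -1):

```python
def first_residual_two_refs(a_i, a_ref1, a_ref2, n1=1.0, n2=1.0, n3=-1.0):
    # r_ai = n1 · a_i - n2 · a_i′ - n3 · a_i″
    # With n1 = n2 = 1 and n3 = -1 this equals a_i - (a_i′ - a_i″), i.e. the
    # predictor is the difference between the two reference descriptors.
    return n1 * a_i - n2 * a_ref1 - n3 * a_ref2

def second_residual_two_refs(p_i, p_ref1, p_ref2, n4=1.0, n5=1.0, n6=-1.0):
    # (r_xi, r_yi, r_zi) = n4 · (x_i, y_i, z_i) - n5 · (x_i′, y_i′, z_i′)
    #                      - n6 · (x_i″, y_i″, z_i″)
    return tuple(n4 * c - n5 * c1 - n6 * c2
                 for c, c1, c2 in zip(p_i, p_ref1, p_ref2))
```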
It should be noted that the foregoing manners of predicting the data of the current 3D map point to obtain its residual data are merely illustrative; the embodiments of this application do not limit the manner in which the data of the current 3D map point is predicted from the data of the at least one reference 3D map point to obtain the residual data of the current 3D map point.
Step 505, the residual data of the current 3D map point is encoded to obtain a code stream.
Optionally, the residual data of the current 3D map point may be encapsulated to obtain the code stream. After the code stream is obtained, it may be stored, which reduces the storage space the encoding device needs for the 3D map. The code stream may also be sent to the decoding device, which reduces the data volume of the 3D map, lowers the bandwidth occupied when the 3D map is transmitted, and improves the transmission efficiency of the 3D map.
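The code-stream syntax itself is not specified in this embodiment, so the following packing sketch is purely an assumed layout (a length-prefixed descriptor residual followed by the three spatial-location residual components, all little-endian) meant only to make the encapsulation step concrete:

```python
import struct

def encapsulate_residual(residual_desc, residual_pos):
    # Assumed layout: a 16-bit descriptor length, the descriptor residual as
    # 32-bit floats, then the spatial-location residual as three 32-bit floats.
    payload = struct.pack(f'<H{len(residual_desc)}f',
                          len(residual_desc), *residual_desc)
    payload += struct.pack('<3f', *residual_pos)
    return payload

stream = encapsulate_residual([0.1, -0.1, 0.2, -0.2], (0.5, -0.2, 0.1))
print(len(stream))  # 2 + 4*4 + 3*4 = 30 bytes
```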
It should be noted that the foregoing embodiments are described using an example in which at least one reference 3D map point is determined first, the data of the current 3D map point is then predicted from the data of the determined at least one reference 3D map point to obtain residual data of the current 3D map point, and the residual data is then encoded to obtain the code stream. The procedure of the foregoing embodiments may be performed with each 3D map point of the plurality of 3D map points taken in turn as the current 3D map point, so as to encode the residual data of every one of the plurality of 3D map points.
The foregoing embodiments are described using an example in which the plurality of 3D map points included in the 3D map are sorted. Alternatively, the plurality of 3D map points may be left unsorted, i.e., the foregoing step 502 may be skipped. In that case, any one or more of the already encoded 3D map points may be determined as the reference 3D map point(s).
Alternatively, after sorting the plurality of 3D map points, the encoding apparatus may skip determining at least one reference 3D map point of the current 3D map point, i.e., the foregoing step 503 may be skipped. In that case, in step 504, the encoding apparatus may directly predict the data of the current 3D map point from the data of at least one reference 3D map point defined by a preset rule. For example, when the plurality of 3D map points are arranged as a sequence, the preset rule may include: determining a first 3D map point as the reference 3D map point, the first 3D map point being the 3D map point arranged immediately before the current 3D map point in the sequence; or determining both a first 3D map point and a second 3D map point as reference 3D map points, the first 3D map point being the 3D map point arranged immediately before the current 3D map point in the sequence and the second 3D map point being the 3D map point arranged immediately before the first 3D map point in the sequence; or determining all 3D map points arranged within the first n positions before the current 3D map point in the sequence as reference 3D map points, where n is greater than 2. When the plurality of 3D map points are arranged as a topology tree, the preset rule may include: determining a third 3D map point as the reference 3D map point, the third 3D map point being the 3D map point located at the parent node position of the current 3D map point in the topology tree; or determining both a third 3D map point and a fourth 3D map point as reference 3D map points, the third 3D map point being the 3D map point located at the parent node position of the current 3D map point in the topology tree and the fourth 3D map point being the 3D map point located at the parent node position of the third 3D map point in the topology tree; or determining the third 3D map point, the fourth 3D map point, and the 3D map point located at the parent node position of the fourth 3D map point as reference 3D map points. For the preset rule, reference may be made to step 503; details are not repeated here.
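A minimal sketch of such preset rules, assuming the current 3D map point is identified by its position in the encoding order (or by a node id with a parent lookup for the topology tree); all names and the value of n are illustrative:

```python
def preset_reference_indices(index, rule, parent_of=None):
    # 'index': position of the current 3D map point in the sequence, or its
    # node id in the topology tree; 'parent_of': maps a node to its parent
    # node id, returning None at the root.
    if rule == 'prev1':       # the 3D map point immediately before the current one
        return [index - 1] if index >= 1 else []
    if rule == 'prev2':       # the two immediately preceding 3D map points
        return [i for i in (index - 1, index - 2) if i >= 0]
    if rule == 'first_n':     # all points within the first n positions before it
        n = 3                 # n greater than 2, per the rule above (value assumed)
        return [i for i in range(index - 1, index - 1 - n, -1) if i >= 0]
    if rule == 'parent':      # the parent node in the topology tree
        p = parent_of(index)
        return [p] if p is not None else []
    if rule == 'ancestors2':  # the parent and grandparent nodes
        refs, p = [], parent_of(index)
        while p is not None and len(refs) < 2:
            refs.append(p)
            p = parent_of(p)
        return refs
    raise ValueError(f'unknown rule: {rule}')
```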
It should be noted that the encoding device obtains the residual data of the current 3D map point from at least one reference 3D map point and then encodes the residual data to obtain the code stream. After the code stream is sent to the decoding device, the decoding device decodes it to obtain the residual data of the current 3D map point and must then obtain the data of the current 3D map point from at least one reference 3D map point. It is therefore necessary to ensure that the at least one reference 3D map point used by the decoding device is consistent with that used by the encoding device. Implementations that ensure this consistency are described below.
In a possible implementation manner, when the encoding device does not determine at least one reference 3D map point of the current 3D map point, i.e. does not perform the foregoing step 503, the encoding device does not need to send content related to the at least one reference 3D map point to the decoding device, and the decoding device may directly obtain the data of the current 3D map point according to the at least one reference 3D map point defined by the preset rule.
In one possible implementation, when the encoding apparatus determines any one or more of the encoded 3D map points as the at least one reference 3D map point, the encoding apparatus may send indication information of the at least one reference 3D map point to the decoding apparatus. Optionally, the encoding apparatus may send the indication information separately, or may add it to the code stream obtained by encoding the residual data of the current 3D map point; the embodiments of this application do not limit the manner of sending the indication information. For example, the indication information of the at least one reference 3D map point may include an identification of the reference 3D map point, the order of the reference 3D map point among the plurality of 3D map points, the order relationship or topology relationship between the reference 3D map point and the current 3D map point, and the like.
In one possible implementation, when the plurality of 3D map points are arranged as a sequence, the encoding device may send the sequence and the determination rule of the at least one reference 3D map point to the decoding device. Optionally, the determination rule may be the foregoing preset rule; details are not repeated here.
Or the encoding device directly sends the indication information of at least one reference 3D map point to the decoding device, and the process of sending the indication information may refer to the foregoing implementation manner, which is not described herein in detail.
Or, when the encoding device encodes the 3D map points one by one in the order given by the sequence, the order of the residual data of the 3D map points in the code stream received by the decoding device is the same as the order of the 3D map points in the sequence. In that case, the encoding device may send only the determination rule of the at least one reference 3D map point to the decoding device; for the determination rule, reference may be made to the foregoing embodiments, and details are not repeated here.
In one possible implementation, when the plurality of 3D map points are arranged as a topology tree, the encoding device needs to encode the structure of the topology tree to obtain a code stream of the topology tree, and send the code stream of the topology tree and the determination rule of the at least one reference 3D map point to the decoding device; the determination rule may be the foregoing preset rule, and details are not repeated here. Optionally, the encoding device may send the code stream of the topology tree separately, or may add it to the code stream obtained by encoding the residual data of the current 3D map point; the embodiments of this application do not limit the manner of sending the topology tree.
Or the encoding device may directly send the indication information of the at least one reference 3D map point to the decoding device; for the process of sending the indication information, reference may be made to the foregoing implementations, and details are not repeated here.
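The syntax of the topology-tree code stream is likewise unspecified here; one simple assumed representation is a parent-index array (one signed 32-bit entry per node, -1 at the root), sketched below for illustration:

```python
import struct

def encode_topology_tree(parent_index):
    # parent_index[i] is the parent node id of node i, or None for the root.
    return b''.join(struct.pack('<i', -1 if p is None else p)
                    for p in parent_index)

def decode_topology_tree(data):
    vals = struct.unpack(f'<{len(data) // 4}i', data)
    return [None if v == -1 else v for v in vals]

tree = [None, 0, 0, 1]  # node 0 is the root; nodes 1 and 2 are its children
assert decode_topology_tree(encode_topology_tree(tree)) == tree
```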
In the embodiments of this application, attribute information of the current 3D map point other than the plurality of attributes may also be predicted according to the order of the current 3D map point in the sequence and encoded to obtain the code stream. For this process, reference may be made to the foregoing embodiments; details are not repeated here.
In summary, in the 3D map encoding method provided in the embodiments of this application, a plurality of 3D map points included in a 3D map are obtained, where the plurality of 3D map points have cross-modal correlation, meaning that the order of any one of the plurality of 3D map points among them is associated with its plurality of attributes; the data of the current 3D map point is then predicted from the data of at least one reference 3D map point to obtain residual data of the current 3D map point; and the residual data of the current 3D map point is then encoded to obtain a code stream. Because the data volume of the residual data is small, storing the code stream reduces the storage space the encoding device needs for the 3D map, and transmitting the code stream lowers the bandwidth occupied when the 3D map is transmitted and improves the transmission efficiency of the 3D map.
In addition, the plurality of 3D map points may be sorted according to their plurality of attributes, and at least one reference 3D map point may then be determined based on the sorted 3D map points. This increases the similarity between the at least one reference 3D map point and the current 3D map point, which reduces the amount of predicted residual data and further reduces the data volume of the 3D map, thereby further reducing the storage space the encoding device needs for the 3D map, or further lowering the bandwidth occupied when the 3D map is transmitted and further improving the transmission efficiency of the 3D map.
The order of the steps of the method provided in the embodiments of this application may be adjusted appropriately, and steps may be added or removed as the situation requires; for example, the foregoing step 502 and/or step 503 may be skipped, which is not limited in the embodiments of this application. Any variation of the method readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application.
Fig. 8 is a flowchart of a process 800 of the 3D map decoding method according to an embodiment of this application. As shown in fig. 8, process 800 may be performed by a decoding apparatus, which may be applied to a server or an electronic device in the foregoing embodiments, for example the electronic device in the embodiment shown in fig. 4a, or the second electronic device in the embodiments shown in fig. 4d and 4f. Process 800 is described as a series of steps or operations; it should be understood that process 800 may be performed in various orders and/or concurrently and is not limited to the execution order depicted in fig. 8. Assuming that the decoding apparatus receives a code stream obtained by compressing the data of a 3D map in an encoding apparatus and performs decoding to obtain the data of the 3D map, process 800, including the following steps, is performed on the code stream of the 3D map currently being processed.
Step 801, decoding the code stream to obtain residual data of a current 3D map point, where the current 3D map point belongs to a 3D map, and the 3D map includes a plurality of 3D map points.
Optionally, the inverse of the encoding process may be performed on the code stream to obtain the residual data of the current 3D map point. Optionally, when the encoding device encapsulated the residual data, the decoding device may decapsulate the code stream to obtain the residual data.
Step 802, determining at least one reference 3D map point of the current 3D map point.
The plurality of 3D map points includes at least one reference 3D map point and the current 3D map point, and each of the at least one reference 3D map point is a 3D map point that has been decoded before the current 3D map point is decoded.
The at least one reference 3D map point determined by the decoding apparatus needs to be consistent with the at least one reference 3D map point determined by the encoding apparatus.
In one possible implementation, when the plurality of 3D map points are arranged as a sequence, if the decoding apparatus receives the sequence and the determination rule of the at least one reference 3D map point, the decoding apparatus may determine the at least one reference 3D map point according to the sequence and the determination rule. For example, the determination rule may be to determine a first 3D map point as the reference 3D map point, the first 3D map point being the 3D map point decoded immediately before the current 3D map point (i.e., the 3D map point arranged immediately before the current 3D map point in the sequence). For example, assume the sequence is P3→P5→P2→P4→P6→P5 and the current 3D map point is P4; the decoding apparatus may then determine P2 as the reference 3D map point. Or the determination rule may be to determine both a first 3D map point and a second 3D map point as reference 3D map points, the first 3D map point being the 3D map point decoded immediately before the current 3D map point, and the second 3D map point being the 3D map point decoded immediately before the first 3D map point (i.e., the 3D map point arranged immediately before the first 3D map point in the sequence). For example, for the same sequence P3→P5→P2→P4→P6→P5 with current 3D map point P4, the decoding apparatus may determine P2 and P5 as the reference 3D map points.
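The worked example above can be replayed in a few lines; the helper name and the rule labels are invented for the sketch:

```python
def decoder_references(sequence, current, rule):
    # 'sequence' is the received ordering of 3D map point identifiers.
    idx = sequence.index(current)
    if rule == 'prev1':
        return sequence[max(0, idx - 1):idx]
    if rule == 'prev2':
        return sequence[max(0, idx - 2):idx]
    raise ValueError(f'unknown rule: {rule}')

seq = ['P3', 'P5', 'P2', 'P4', 'P6', 'P5']
print(decoder_references(seq, 'P4', 'prev1'))  # ['P2']
print(decoder_references(seq, 'P4', 'prev2'))  # ['P5', 'P2']
```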
If the decoding apparatus receives the indication information of the at least one reference 3D map point, the decoding apparatus may determine the at least one reference 3D map point directly according to the indication information.
If the decoding apparatus receives only the determination rule of at least one reference 3D map point, the decoding apparatus may determine the at least one reference 3D map point according to the ordering of the residual data of the 3D map points in the received code stream and the determination rule. The determining rule and the determining manner may refer to the foregoing embodiments, and the embodiments of the present application are not described herein.
In one possible implementation manner, when the plurality of 3D map points are arranged according to the topology tree, if the decoding device receives the code stream of the topology tree, the decoding device may decode the code stream of the topology tree to obtain the structure of the topology tree, and determine at least one reference 3D map point according to the structure of the topology tree and the determination rule. For example, the determination rule may be to determine a third 3D map point as a reference 3D map point, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree. For example, assuming that the structure of the topology tree is as shown in fig. 7, the current 3D map point is S6, the decoding apparatus may determine S1 as the reference 3D map point. Or the determination rule may be to determine both a third 3D map point and a fourth 3D map point as reference 3D map points, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree, the fourth 3D map point being a 3D map point located at a parent node position of the third 3D map point in the topology tree. For example, assuming that the structure of the topology tree is as shown in fig. 7, the current 3D map point is S6, the decoding apparatus may determine S1 and S2 as the reference 3D map points.
If the decoding apparatus receives the indication information of the at least one reference 3D map point, the decoding apparatus may determine the at least one reference 3D map point directly according to the indication information.
In one possible implementation, if the decoding apparatus receives the indication information of the at least one reference 3D map point, the decoding apparatus may determine the at least one reference 3D map point directly according to the indication information.
The process may refer to the foregoing encoding-side embodiment, and the embodiments of the present application are not described herein.
Step 803, obtaining the data of the current 3D map point according to the data of at least one reference 3D map point and the residual data of the current 3D map point.
The data includes at least 3D map point descriptors. The data for any one of the plurality of 3D map points is consistent with the data for any one of the 3D map points in step 504. The data may refer to the foregoing step 504, and the embodiments of the present application are not described herein.
Corresponding to the implementation in step 504 in which a predicted value of the current 3D map point is obtained from the data of the at least one reference 3D map point and the residual data is then obtained from the data of the current 3D map point and the predicted value, the decoding apparatus may obtain the predicted value of the current 3D map point from the data of the at least one reference 3D map point, and then obtain the data of the current 3D map point from the residual data of the current 3D map point and the obtained predicted value. For example, when in step 504 the predicted value is subtracted from the data of the current 3D map point to obtain the residual data, in step 803 the residual data of the current 3D map point may be added to the obtained predicted value to obtain the data of the current 3D map point. Taking data that includes a 3D map point descriptor and a 3D map point spatial location as an example, a predicted value of the 3D map point descriptor of the current 3D map point may first be obtained based on the 3D map point descriptor of the at least one reference 3D map point, and the residual data of the 3D map point descriptor of the current 3D map point may then be added to this predicted value to obtain the 3D map point descriptor of the current 3D map point. The 3D map point spatial location of the current 3D map point is obtained in a similar way: a predicted value is obtained first, and the residual value is then added to the predicted value to obtain the actual value.
Corresponding to the implementation in step 504 in which the residual data of the current 3D map point is obtained directly from the data of the current 3D map point and the data of the at least one reference 3D map point, the decoding apparatus may obtain the data of the current 3D map point from the residual data of the current 3D map point and the data of the at least one reference 3D map point. For example, when in step 504 the data of the at least one reference 3D map point is subtracted from the data of the current 3D map point to obtain the residual data, in step 803 the residual data of the current 3D map point may be added to the data of the at least one reference 3D map point to obtain the data of the current 3D map point. Taking data that includes a 3D map point descriptor and a 3D map point spatial location as an example, the residual data of the 3D map point descriptor of the current 3D map point may be added to the 3D map point descriptor of the at least one reference 3D map point to obtain the 3D map point descriptor of the current 3D map point. The 3D map point spatial location of the current 3D map point is obtained in a similar way: the residual value is added to the reference value to obtain the actual value.
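Mirroring the encoder-side sketch given earlier, reconstruction adds the residual back; averaging is again only one possible prediction rule, and the decoder must apply exactly the same rule and the same references as the encoder for the reconstruction to be exact:

```python
import numpy as np

def reconstruct_from_prediction(residual, references):
    # Inverse of data minus prediction: prediction plus residual.
    predicted = np.mean(references, axis=0)
    return residual + predicted

def reconstruct_direct(residual, reference):
    # Inverse of data minus reference: reference plus residual.
    return residual + reference
```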
In the embodiment of the present application, a part or all of the elements included in the data of the current 3D map point may be obtained according to a part or all of the elements included in the data of the at least one reference 3D map point and the residual data of the corresponding elements.
For example, assuming that the data includes a 3D map point descriptor and a 3D map point spatial location, the 3D map point descriptor of the current 3D map point may be acquired from residual data of the 3D map point descriptor of the current 3D map point and at least one 3D map point descriptor referring to the 3D map point. And acquiring the 3D map point space position of the current 3D map point according to the residual data of the 3D map point space position of the current 3D map point and the 3D map point space position of at least one reference 3D map point. The data of the current 3D map point includes a 3D map point descriptor and a 3D map point spatial location.
In the embodiments of this application, taking as an example that the data includes a 3D map point descriptor and a 3D map point spatial location (so that, correspondingly, the residual data of the current 3D map point includes residual data of the 3D map point descriptor and residual data of the 3D map point spatial location) and that the 3D map point spatial location is represented by X, Y, Z coordinates on three-dimensional spatial axes, the 3D map point descriptor of the current 3D map point can be obtained from the residual data of the 3D map point descriptor of the current 3D map point, the 3D map point descriptor of the at least one reference 3D map point, and the first residual formula; and the 3D map point spatial location of the current 3D map point can be obtained from the residual data of the 3D map point spatial location of the current 3D map point, the 3D map point spatial location of the at least one reference 3D map point, and the second residual formula.
Corresponding to the foregoing step 504, when the number of reference 3D map points is one (for example, the first 3D map point or the third 3D map point in the foregoing step 503), the 3D map point descriptor of the current 3D map point may be:
a_i = (m2 · a_i′ + r_ai) / m1
the 3D map point spatial location of the current 3D map point may be:
(x_i, y_i, z_i) = [m4 · (x_i′, y_i′, z_i′) + (r_xi, r_yi, r_zi)] / m3
when the number of reference 3D map points is plural, taking the number of reference 3D map points as two as an example, (for example, the first 3D map point and the second 3D map point, or the third 3D map point and the fourth 3D map point in the foregoing step 503), the 3D map point descriptor of the current 3D map point may be:
a_i = (n2 · a_i′ + n3 · a_i″ + r_ai) / n1
the 3D map point spatial location of the current 3D map point may be:
(x_i, y_i, z_i) = [n5 · (x_i′, y_i′, z_i′) + n6 · (x_i″, y_i″, z_i″) + (r_xi, r_yi, r_zi)] / n4
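As an illustration, these decode-side formulas are the exact inverses of the encoder-side ones; the defaults again follow the example parameter values (m1 = m2 = 1, n1 = n2 = n4 = n5 = 1, n3 = n6 = -1):

```python
def decode_descriptor_one_ref(r_ai, a_ref, m1=1.0, m2=1.0):
    # a_i = (m2 · a_i′ + r_ai) / m1
    return (m2 * a_ref + r_ai) / m1

def decode_descriptor_two_refs(r_ai, a_ref1, a_ref2, n1=1.0, n2=1.0, n3=-1.0):
    # a_i = (n2 · a_i′ + n3 · a_i″ + r_ai) / n1
    return (n2 * a_ref1 + n3 * a_ref2 + r_ai) / n1

def decode_position_two_refs(r_pos, p_ref1, p_ref2, n4=1.0, n5=1.0, n6=-1.0):
    # (x_i, y_i, z_i) = [n5 · (x_i′, y_i′, z_i′) + n6 · (x_i″, y_i″, z_i″)
    #                    + (r_xi, r_yi, r_zi)] / n4
    return tuple((n5 * c1 + n6 * c2 + r) / n4
                 for c1, c2, r in zip(p_ref1, p_ref2, r_pos))
```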
the definition of each parameter in the first residual equation and the second residual equation in this process may refer to the foregoing step 504, and the embodiments of the present application will not be described herein.
In the embodiments of this application, attribute information of the current 3D map point other than the plurality of attributes may also be obtained from the received code stream. For this process, reference may be made to the foregoing embodiments; details are not repeated here.
In the embodiments of this application, when the encoding device does not determine at least one reference 3D map point of the current 3D map point but directly predicts the data of the current 3D map point from the data of at least one reference 3D map point defined by the preset rule (i.e., the encoding device does not perform the foregoing step 503), the decoding device likewise need not determine at least one reference 3D map point of the current 3D map point, i.e., the foregoing step 802 may be skipped. In that case, in step 803, the decoding device obtains the data of the current 3D map point from the residual data of the current 3D map point and the data of the at least one reference 3D map point defined by the preset rule. For the obtaining process, reference may be made to the foregoing embodiments; details are not repeated here.
In summary, in the 3D map decoding method provided in the embodiments of this application, the code stream is decoded to obtain residual data of the current 3D map point, at least one reference 3D map point of the current 3D map point is determined, and the data of the current 3D map point is then obtained from the data of the at least one reference 3D map point and the residual data of the current 3D map point. Because the received code stream carries the residual data of the current 3D map point, and the data volume of residual data is small, the bandwidth occupied when the 3D map is transmitted to the decoding device can be reduced and the transmission efficiency of the 3D map improved.
In addition, at least one reference 3D map point of the current 3D map point may be determined first, and the data of the current 3D map point may then be obtained from the data of the determined at least one reference 3D map point and the residual data of the current 3D map point. The at least one reference 3D map point is consistent with the at least one reference 3D map point determined by the encoding device, which may have been obtained by the encoding device first sorting the plurality of 3D map points included in the 3D map and then determining the reference point(s) based on the sorted 3D map points. This increases the similarity between the at least one reference 3D map point and the current 3D map point, which reduces the amount of predicted residual data and further reduces the data volume of the 3D map, thereby further reducing the bandwidth occupied when the 3D map is transmitted to the decoding device and further improving the transmission efficiency of the 3D map.
The order of the steps of the method provided in the embodiments of this application may be adjusted appropriately, and steps may be added or removed as the situation requires; for example, the foregoing step 802 may be skipped. Any variation of the method readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application; the embodiments of this application are not limited in this respect.
Fig. 9 is a block diagram of an encoding apparatus 90 for a 3D map according to an embodiment of this application. As shown in fig. 9, the encoding apparatus 90 may be applied to an electronic device or a server in the foregoing embodiments, particularly a device that needs to compress and transmit a 3D map, for example the server in the embodiment shown in fig. 4a, or the first electronic device in the embodiments shown in fig. 4d to 4f. The encoding apparatus 90 of the embodiments of this application may include: a prediction module 91 and an encapsulation module 92, wherein,
a prediction module 91, configured to obtain a plurality of 3D map points included in the 3D map, where the plurality of 3D map points have cross-modal correlation, and the plurality of 3D map points have cross-modal correlation, that is, an order of any one 3D map point in the plurality of 3D map points is associated with a plurality of attributes thereof, where the plurality of attributes of the 3D map points include at least 3D map point descriptors; predicting the data of the current 3D map point according to the data of the at least one reference 3D map point to obtain residual data of the current 3D map point, wherein the data at least comprises a plurality of attributes, the plurality of 3D map points comprise the current 3D map point and at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point coded before the current 3D map point is coded.
an encapsulation module 92, configured to encode the residual data of the current 3D map point to obtain a code stream.
In one possible implementation, the plurality of attributes of the 3D map points further includes a 3D map point spatial location.
In a possible implementation, the prediction module 91 is further configured to sort the plurality of 3D map points according to a plurality of attributes of the plurality of 3D map points.
In one possible implementation, the plurality of 3D map points are arranged in a sequence, each element in the sequence corresponding to a respective one of the plurality of 3D map points; in the sequence, a plurality of attributes between two adjacent 3D map points are associated.
In one possible implementation, the plurality of 3D map points are arranged in a topology tree, each node in the topology tree corresponding to one 3D map point of the plurality of 3D map points; in the topology tree, a plurality of attributes are associated between a 3D map point located at a parent node position and a 3D map point located at a child node position.
In a possible implementation, the prediction module 91 is further configured to determine at least one reference 3D map point.
In a possible implementation, when a plurality of 3D map points are arranged in a sequence, the prediction module 91 is specifically configured to determine a first 3D map point as at least one reference 3D map point, where the first 3D map point is a 3D map point arranged in the sequence before the current 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a sequence, the prediction module 91 is specifically configured to determine each of the first 3D map point and the second 3D map point as at least one reference 3D map point, where the first 3D map point is a 3D map point arranged in the sequence before the current 3D map point, and the second 3D map point is a 3D map point arranged in the sequence before the first 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the prediction module 91 is specifically configured to determine a third 3D map point as at least one reference 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the prediction module 91 is specifically configured to determine each of a third 3D map point and a fourth 3D map point as at least one reference 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree, and the fourth 3D map point is a 3D map point located at a parent node position of the third 3D map point in the topology tree.
In a possible implementation manner, the prediction module 91 is specifically configured to obtain a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and subtracting the obtained predicted value of the current 3D map point from the data of the current 3D map point to obtain residual data of the current 3D map point.
In one possible implementation, the prediction module 91 is specifically configured to predict the 3D map point descriptor of the current 3D map point according to the 3D map point descriptor of the at least one reference 3D map point to obtain residual data of the 3D map point descriptor of the current 3D map point, and to predict the 3D map point spatial location of the current 3D map point according to the 3D map point spatial location of the at least one reference 3D map point to obtain residual data of the 3D map point spatial location of the current 3D map point.
In one possible implementation, the code stream further includes at least one indication information referring to the 3D map points.
In summary, in the 3D map encoding apparatus provided in the embodiments of this application, the prediction module obtains a plurality of 3D map points included in a 3D map, where the plurality of 3D map points have cross-modal correlation, meaning that the order of any one of the plurality of 3D map points among them is associated with its plurality of attributes; the prediction module then predicts the data of the current 3D map point from the data of at least one reference 3D map point to obtain residual data of the current 3D map point, and the encapsulation module encodes the residual data of the current 3D map point to obtain a code stream. Because the data volume of the residual data is small, storing the code stream reduces the storage space the encoding apparatus needs for the 3D map, and transmitting the code stream lowers the bandwidth occupied when the 3D map is transmitted and improves the transmission efficiency of the 3D map.
In addition, the prediction module may sort the plurality of 3D map points according to their plurality of attributes and then determine at least one reference 3D map point based on the sorted 3D map points. This increases the similarity between the at least one reference 3D map point and the current 3D map point, which reduces the amount of predicted residual data and further reduces the data volume of the 3D map, thereby further reducing the storage space the encoding apparatus needs for the 3D map, or further lowering the bandwidth occupied when the 3D map is transmitted and further improving the transmission efficiency of the 3D map.
The encoding device of the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in fig. 5, and its implementation principle and technical effects are similar, and are not described herein again.
Fig. 10 is a block diagram of a decoding apparatus 100 for a 3D map according to an embodiment of this application. As shown in fig. 10, the decoding apparatus 100 may be applied to an electronic device or a server in the foregoing embodiments, particularly a device that needs to receive and decompress a 3D map, for example the electronic device in the embodiment shown in fig. 4a, or the second electronic device in the embodiments shown in fig. 4d and 4f. The decoding apparatus 100 of the embodiments of this application may include: a decapsulation module 101 and a prediction module 102, wherein,
The decapsulation module 101 is configured to decode the code stream to obtain residual data of a current 3D map point, where the current 3D map point belongs to a 3D map, and the 3D map includes a plurality of 3D map points.
A prediction module 102, configured to obtain data of a current 3D map point according to the data of at least one reference 3D map point and residual data of the current 3D map point, where the data includes at least a 3D map point descriptor, and the plurality of 3D map points includes at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point decoded before decoding the current 3D map point.
In one possible implementation, the data further includes a 3D map point spatial location.
In a possible implementation manner, the prediction module 102 is specifically configured to obtain a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and adding the residual data of the current 3D map point and the obtained predicted value of the current 3D map point to obtain the data of the current 3D map point.
In one possible implementation, the prediction module 102 is further configured to determine at least one reference 3D map point.
In one possible implementation, when a plurality of 3D map points are arranged in a sequential manner, the prediction module 102 is specifically configured to determine a first 3D map point as at least one reference 3D map point, where the first 3D map point is a last decoded 3D map point of the current 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a sequential manner, the prediction module 102 is specifically configured to determine each of the first 3D map point and the second 3D map point as at least one reference 3D map point, where the first 3D map point is a last decoded 3D map point of the current 3D map point, and the second 3D map point is a last decoded 3D map point of the first 3D map point.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the prediction module 102 is specifically configured to determine a third 3D map point as at least one reference 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree.
In one possible implementation, when the plurality of 3D map points are arranged in a topology tree, the prediction module 102 is specifically configured to determine each of a third 3D map point and a fourth 3D map point as at least one reference 3D map point, where the third 3D map point is a 3D map point located at a parent node position of the current 3D map point in the topology tree, and the fourth 3D map point is a 3D map point located at a parent node position of the third 3D map point in the topology tree.
In one possible implementation manner, the prediction module 102 is specifically configured to obtain a 3D map point descriptor of the current 3D map point according to residual data of a 3D map point descriptor of the current 3D map point and at least one 3D map point descriptor of the reference 3D map point; and acquiring the 3D map point space position of the current 3D map point according to the residual data of the 3D map point space position of the current 3D map point and the 3D map point space position of at least one reference 3D map point.
In summary, in the 3D map decoding apparatus provided in the embodiments of this application, the decapsulation module decodes the code stream to obtain residual data of the current 3D map point, the prediction module determines at least one reference 3D map point of the current 3D map point, and the prediction module then obtains the data of the current 3D map point from the data of the at least one reference 3D map point and the residual data of the current 3D map point. Because the received code stream carries the residual data of the current 3D map point, and the data volume of residual data is small, the bandwidth occupied when the 3D map is transmitted to the decoding apparatus can be reduced and the transmission efficiency of the 3D map improved.
In addition, at least one reference 3D map point of the current 3D map point can be determined by the prediction module, then the data of the current 3D map point is obtained according to the determined data of the at least one reference 3D map point and the residual data of the current 3D map point, the at least one reference 3D map point is consistent with the at least one reference 3D map point determined by the decoding device, the at least one reference 3D map point can be obtained by the encoding device by sorting a plurality of 3D map points included in the 3D map, and then the similarity between the at least one reference 3D map point and the current 3D map point is obtained based on the sorted plurality of 3D map points, so that the predicted residual data volume is reduced, the data volume of the 3D map is further reduced, the bandwidth occupancy rate when the 3D map is transmitted to the decoding device is further reduced, and the transmission efficiency of the 3D map is further improved.
The decoding device of the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in fig. 8, and its implementation principle and technical effects are similar, and are not described herein again.
In implementation, the steps of the above method embodiments may be implemented by integrated logic circuits of hardware in a processor or instructions in software form. The processor may be a general purpose processor, a digital signal processor (digital signal processor, DSP), an Application Specific Integrated Circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented as a hardware encoding processor executing, or may be implemented by a combination of hardware and software modules in the encoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
The memory mentioned in the above embodiments may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (personal computer, server, network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (56)

  1. A method of encoding a 3D map, comprising:
    acquiring a plurality of 3D map points contained in a 3D map, wherein the 3D map points have cross-modal correlation, the cross-modal correlation of the 3D map points means that the sequence of any 3D map point in the 3D map points is associated with a plurality of attributes of the 3D map points, and the attributes of the 3D map points at least comprise 3D map point descriptors;
    predicting data of a current 3D map point according to data of at least one reference 3D map point to obtain residual data of the current 3D map point, wherein the data at least comprises the plurality of attributes, the plurality of 3D map points comprise the current 3D map point and the at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point coded before the current 3D map point is coded;
    And encoding residual data of the current 3D map point to obtain a code stream.
  2. The method of claim 1, wherein the plurality of attributes of the 3D map points further comprise 3D map point spatial locations.
  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    the plurality of 3D map points are ordered according to the plurality of attributes of the plurality of 3D map points.
  4. A method according to any one of claims 1-3, wherein the plurality of 3D map points are arranged in a sequence, each element in the sequence corresponding to a respective one of the plurality of 3D map points; in the sequence, the plurality of attributes between two adjacent 3D map points are associated.
  5. A method according to any one of claims 1-3, wherein the plurality of 3D map points are arranged in a topology tree, each node in the topology tree corresponding to a respective one of the plurality of 3D map points; in the topology tree, the plurality of attributes are associated between a 3D map point located at a parent node position and a 3D map point located at a child node position.
  6. The method according to any one of claims 1-5, further comprising:
    The at least one reference 3D map point is determined.
  7. The method of claim 6, wherein the determining the at least one reference 3D map point when the plurality of 3D map points are arranged in a sequential manner comprises:
    a first 3D map point is determined as the at least one reference 3D map point, the first 3D map point being a 3D map point arranged in the sequence preceding the current 3D map point.
  8. The method of claim 6, wherein the determining the at least one reference 3D map point when the plurality of 3D map points are arranged in a sequential manner comprises:
    determining a first 3D map point and a second 3D map point as the at least one reference 3D map point, the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point, the second 3D map point being a 3D map point arranged in the sequence before the first 3D map point.
  9. The method of claim 6, wherein the determining the at least one reference 3D map point when the plurality of 3D map points are arranged in a topology tree, comprises:
    A third 3D map point is determined as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point.
  10. The method of claim 6, wherein, when the plurality of 3D map points are arranged in a topology tree, the determining the at least one reference 3D map point comprises:
    determining a third 3D map point and a fourth 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point, the fourth 3D map point being a 3D map point located in the topology tree at a parent node position of the third 3D map point.
  11. The method according to any one of claims 1-10, wherein predicting the data of the current 3D map point from the data of at least one reference 3D map point results in residual data of the current 3D map point, comprising:
    acquiring a predicted value of the current 3D map point according to the data of the at least one reference 3D map point;
    and subtracting the obtained predicted value of the current 3D map point from the data of the current 3D map point to obtain residual data of the current 3D map point.
  12. The method according to any one of claims 2-11, wherein predicting the data of the current 3D map point from the data of at least one reference 3D map point results in residual data of the current 3D map point, comprising:
    predicting the 3D map point descriptor of the current 3D map point according to the 3D map point descriptor of the at least one reference 3D map point to obtain residual data of the 3D map point descriptor of the current 3D map point;
    and predicting the 3D map point spatial location of the current 3D map point according to the 3D map point spatial location of the at least one reference 3D map point to obtain residual data of the 3D map point spatial location of the current 3D map point.
  13. The method according to any of claims 1-12, wherein the code stream further comprises indication information of the at least one reference 3D map point.
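By way of illustration of the encoding method of claims 1-13, the following minimal Python sketch uses the sequence arrangement of claims 4 and 7: the points are first ordered by their attributes, and each point is then predicted from the previously encoded point, so only the residual is written to the stream (claim 11). This is an illustration under stated assumptions, not the claimed implementation: the names MapPoint and encode_map, the quantization of descriptors and spatial locations to integers, the little-endian byte layout, and the omission of entropy coding of the residuals are all choices made here for brevity.

    from dataclasses import dataclass
    from typing import List
    import struct

    @dataclass
    class MapPoint:
        descriptor: List[int]   # quantized 3D map point descriptor components
        position: List[int]     # quantized 3D map point spatial location (x, y, z)

    def encode_map(points: List[MapPoint]) -> bytes:
        # Order the points by their attributes so that adjacent points are
        # similar (one possible realization of the cross-modal correlation
        # of claim 1, corresponding to the ordering step of claim 3).
        points = sorted(points, key=lambda p: (p.position, p.descriptor))
        out = bytearray(struct.pack("<I", len(points)))
        prev = None
        for p in points:
            if prev is None:
                # The first point has no reference and is stored directly.
                desc_res, pos_res = p.descriptor, p.position
            else:
                # Claim 11: residual = current data minus predicted value;
                # here the prediction is simply the previous point's data
                # (the single-reference case of claim 7).
                desc_res = [c - r for c, r in zip(p.descriptor, prev.descriptor)]
                pos_res = [c - r for c, r in zip(p.position, prev.position)]
            out += struct.pack(f"<{len(desc_res)}i", *desc_res)
            out += struct.pack(f"<{len(pos_res)}i", *pos_res)
            prev = p
        return bytes(out)

Because adjacent points are similar after ordering, most residual components are small integers, which is what would make a subsequent entropy-coding stage (omitted here) effective.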
  14. A method of decoding a 3D map, comprising:
    decoding the code stream to obtain residual data of a current 3D map point, wherein the current 3D map point belongs to a 3D map, and the 3D map comprises a plurality of 3D map points;
    acquiring data of the current 3D map point according to the data of at least one reference 3D map point and residual data of the current 3D map point, wherein the data at least comprises a 3D map point descriptor, the plurality of 3D map points comprise the at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point decoded before the current 3D map point is decoded.
  15. The method of claim 14, wherein the data further comprises a 3D map point spatial location.
  16. The method according to claim 14 or 15, wherein said obtaining data of the current 3D map point from data of at least one reference 3D map point and residual data of the current 3D map point comprises:
    acquiring a predicted value of the current 3D map point according to the data of the at least one reference 3D map point;
    and adding the residual data of the current 3D map point and the obtained predicted value of the current 3D map point to obtain the data of the current 3D map point.
  17. The method according to any one of claims 14-16, further comprising:
    determining the at least one reference 3D map point.
  18. The method of claim 17, wherein, when the plurality of 3D map points are arranged in a sequence, the determining the at least one reference 3D map point comprises:
    determining a first 3D map point as the at least one reference 3D map point, the first 3D map point being the 3D map point decoded immediately before the current 3D map point.
  19. The method of claim 17, wherein, when the plurality of 3D map points are arranged in a sequence, the determining the at least one reference 3D map point comprises:
    determining a first 3D map point and a second 3D map point as the at least one reference 3D map point, the first 3D map point being the 3D map point decoded immediately before the current 3D map point, and the second 3D map point being the 3D map point decoded immediately before the first 3D map point.
  20. The method of claim 17, wherein, when the plurality of 3D map points are arranged in a topology tree, the determining the at least one reference 3D map point comprises:
    determining a third 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree.
  21. The method of claim 17, wherein, when the plurality of 3D map points are arranged in a topology tree, the determining the at least one reference 3D map point comprises:
    determining a third 3D map point and a fourth 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point, the fourth 3D map point being a 3D map point located in the topology tree at a parent node position of the third 3D map point.
  22. The method according to any one of claims 14-21, wherein said obtaining data of the current 3D map point from data of at least one reference 3D map point and residual data of the current 3D map point comprises:
    acquiring the 3D map point descriptor of the current 3D map point according to residual data of the 3D map point descriptor of the current 3D map point and the 3D map point descriptor of the at least one reference 3D map point;
    and acquiring the 3D map point spatial location of the current 3D map point according to residual data of the 3D map point spatial location of the current 3D map point and the 3D map point spatial location of the at least one reference 3D map point.
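Mirroring the encoder sketch above, a minimal decoder for claims 14-22 reproduces the same prediction and adds the decoded residual to it (claim 16). It assumes the hypothetical byte layout of the earlier sketch and a fixed descriptor length DESC_LEN known to both encoder and decoder; both constants are illustrative.

    import struct
    from typing import List, Tuple

    DESC_LEN = 128   # assumed fixed descriptor length, known to both sides
    POS_LEN = 3      # x, y, z

    def decode_map(stream: bytes) -> List[Tuple[List[int], List[int]]]:
        (count,) = struct.unpack_from("<I", stream, 0)
        offset = 4
        points, prev_desc, prev_pos = [], None, None
        for _ in range(count):
            desc_res = list(struct.unpack_from(f"<{DESC_LEN}i", stream, offset))
            offset += 4 * DESC_LEN
            pos_res = list(struct.unpack_from(f"<{POS_LEN}i", stream, offset))
            offset += 4 * POS_LEN
            if prev_desc is None:
                # The first point was stored directly by the encoder.
                desc, pos = desc_res, pos_res
            else:
                # Claim 16: data = residual plus predicted value, the exact
                # inverse of the encoder's subtraction.
                desc = [r + p for r, p in zip(desc_res, prev_desc)]
                pos = [r + p for r, p in zip(pos_res, prev_pos)]
            points.append((desc, pos))
            prev_desc, prev_pos = desc, pos
        return points

Because the decoder derives each prediction from already-decoded points (claim 18), no side information beyond the residuals is strictly needed in this single-reference sequence mode; the indication information of claim 13 matters when the reference selection is not implicit.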
  23. An encoding device for a 3D map, comprising:
    a prediction module, configured to acquire a plurality of 3D map points contained in a 3D map, wherein the plurality of 3D map points have cross-modal correlation, that is, the order of any 3D map point among the plurality of 3D map points is associated with a plurality of attributes of that 3D map point, and the plurality of attributes of the 3D map points comprise at least a 3D map point descriptor; and to predict data of a current 3D map point according to data of at least one reference 3D map point to obtain residual data of the current 3D map point, wherein the data at least comprises the plurality of attributes, the plurality of 3D map points comprise the current 3D map point and the at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point encoded before the current 3D map point is encoded;
    and an encapsulation module, configured to encode the residual data of the current 3D map point to obtain a code stream.
  24. The apparatus of claim 23, wherein the plurality of attributes of the 3D map points further comprise 3D map point spatial locations.
  25. The apparatus of claim 23 or 24, wherein the prediction module is further configured to order the plurality of 3D map points according to the plurality of attributes of the plurality of 3D map points.
  26. The apparatus of any one of claims 23-25, wherein the plurality of 3D map points are arranged in a sequence, each element in the sequence corresponding to a respective one of the plurality of 3D map points, and, in the sequence, the plurality of attributes of two adjacent 3D map points are associated with each other.
  27. The apparatus of any one of claims 23-25, wherein the plurality of 3D map points are arranged in a topology tree, each node in the topology tree corresponding to a respective one of the plurality of 3D map points, and, in the topology tree, the plurality of attributes of a 3D map point located at a parent node position and a 3D map point located at a child node position are associated with each other.
  28. The apparatus of any one of claims 23-27, wherein the prediction module is further configured to determine the at least one reference 3D map point.
  29. The apparatus according to claim 28, wherein, when the plurality of 3D map points are arranged in a sequence, the prediction module is configured to determine a first 3D map point as the at least one reference 3D map point, the first 3D map point being a 3D map point arranged before the current 3D map point in the sequence.
  30. The apparatus according to claim 28, wherein when the plurality of 3D map points are arranged in a sequence, the prediction module is specifically configured to determine a first 3D map point and a second 3D map point as the at least one reference 3D map point, the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point, the second 3D map point being a 3D map point arranged in the sequence before the first 3D map point.
  31. The apparatus according to claim 28, wherein, when the plurality of 3D map points are arranged in a topology tree, the prediction module is configured to determine a third 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree.
  32. The apparatus according to claim 28, wherein, when the plurality of 3D map points are arranged in a topology tree, the prediction module is configured to determine a third 3D map point and a fourth 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree, and the fourth 3D map point being a 3D map point located at a parent node position of the third 3D map point in the topology tree.
  33. The apparatus according to any one of claims 23-32, wherein the prediction module is configured to obtain a predicted value of the current 3D map point from the data of the at least one reference 3D map point; and subtracting the obtained predicted value of the current 3D map point from the data of the current 3D map point to obtain residual data of the current 3D map point.
  34. The apparatus according to any one of claims 24-33, wherein the prediction module is specifically configured to predict the 3D map point descriptor of the current 3D map point according to the 3D map point descriptor of the at least one reference 3D map point to obtain residual data of the 3D map point descriptor of the current 3D map point; and to predict the 3D map point spatial location of the current 3D map point according to the 3D map point spatial location of the at least one reference 3D map point to obtain residual data of the 3D map point spatial location of the current 3D map point.
  35. The apparatus of any of claims 23-34, wherein the code stream further comprises indication information of the at least one reference 3D map point.
  36. A decoding apparatus of a 3D map, comprising:
    a decapsulation module, configured to decode a code stream to obtain residual data of a current 3D map point, wherein the current 3D map point belongs to a 3D map, and the 3D map comprises a plurality of 3D map points;
    and a prediction module, configured to obtain data of the current 3D map point according to data of at least one reference 3D map point and the residual data of the current 3D map point, wherein the data at least comprises a 3D map point descriptor, the plurality of 3D map points comprise the at least one reference 3D map point, and each reference 3D map point in the at least one reference 3D map point is a 3D map point decoded before the current 3D map point is decoded.
  37. The apparatus of claim 36, wherein the data further comprises a 3D map point spatial location.
  38. The apparatus according to claim 36 or 37, wherein the prediction module is configured to obtain a predicted value of the current 3D map point according to the data of the at least one reference 3D map point; and adding the residual data of the current 3D map point and the obtained predicted value of the current 3D map point to obtain the data of the current 3D map point.
  39. The apparatus of any one of claims 36-38, wherein the prediction module is further configured to determine the at least one reference 3D map point.
  40. The apparatus of claim 39, wherein, when the plurality of 3D map points are arranged in a sequence, the prediction module is specifically configured to determine a first 3D map point as the at least one reference 3D map point, the first 3D map point being the 3D map point decoded immediately before the current 3D map point.
  41. The apparatus of claim 39, wherein, when the plurality of 3D map points are arranged in a sequence, the prediction module is specifically configured to determine a first 3D map point and a second 3D map point as the at least one reference 3D map point, the first 3D map point being the 3D map point decoded immediately before the current 3D map point, and the second 3D map point being the 3D map point decoded immediately before the first 3D map point.
  42. The apparatus of claim 39, wherein the prediction module is specifically configured to determine a third 3D map point as the at least one reference 3D map point when the plurality of 3D map points are arranged in a topology tree, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree.
  43. The apparatus of claim 39, wherein the prediction module is specifically configured to determine, when the plurality of 3D map points are arranged in a topology tree, each of a third 3D map point and a fourth 3D map point as the at least one reference 3D map point, the third 3D map point being a 3D map point located in the topology tree at a parent node position of the current 3D map point, and the fourth 3D map point being a 3D map point located in the topology tree at a parent node position of the third 3D map point.
  44. The apparatus according to any one of claims 36-43, wherein the prediction module is configured to obtain the 3D map point descriptor of the current 3D map point according to residual data of the 3D map point descriptor of the current 3D map point and the 3D map point descriptor of the at least one reference 3D map point; and to obtain the 3D map point spatial location of the current 3D map point according to residual data of the 3D map point spatial location of the current 3D map point and the 3D map point spatial location of the at least one reference 3D map point.
  45. An encoding apparatus of a 3D map, comprising:
    one or more processors;
    a memory for storing one or more programs;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-13.
  46. A decoding apparatus of a 3D map, comprising:
    one or more processors;
    a memory for storing one or more programs;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 14-22.
  47. A computer readable storage medium comprising a computer program which, when executed on a computer, causes the computer to perform the method of any of claims 1-22.
  48. A computer program product, characterized in that the computer program product comprises computer program code which, when run on a computer, causes the computer to perform the method of any of claims 1-22.
  49. A 3D map coding code stream, wherein the 3D map coding code stream includes residual data of a current 3D map point, the residual data of the current 3D map point is obtained by predicting data of the current 3D map point by using data of at least one reference 3D map point, the current 3D map point and the at least one reference 3D map point respectively belong to a 3D map, and each reference 3D map point in the at least one reference 3D map point is a 3D map point encoded before encoding the current 3D map point;
    wherein the 3D map comprises a plurality of 3D map points, the plurality of 3D map points have cross-modal correlation, meaning that the order of any 3D map point among the plurality of 3D map points is associated with a plurality of attributes of that 3D map point; the plurality of attributes of the 3D map points comprise at least a 3D map point descriptor, and the data at least comprise the plurality of attributes.
  50. The code stream of claim 49, wherein the at least one reference 3D map point is taken from a plurality of ordered 3D map points.
  51. The code stream of claim 49, wherein the plurality of 3D map points are arranged in a sequence in which the plurality of attributes between two adjacent 3D map points are associated.
  52. The code stream of claim 51, wherein the at least one reference 3D map point comprises a first 3D map point, the first 3D map point being a 3D map point arranged in the sequence immediately preceding the current 3D map point.
  53. The code stream of claim 51 or 52, wherein the at least one reference 3D map point comprises a first 3D map point and a second 3D map point, the first 3D map point being a 3D map point arranged in the sequence before the current 3D map point, the second 3D map point being a 3D map point arranged in the sequence before the first 3D map point.
  54. The code stream of claim 49, wherein the plurality of 3D map points are arranged in a topology tree in which the plurality of attributes are associated between a 3D map point located at a parent node location and a 3D map point located at a child node location.
  55. The code stream of claim 54, wherein the at least one reference 3D map point comprises a third 3D map point, the third 3D map point being a 3D map point located at a parent node location of the current 3D map point in the topology tree.
  56. The code stream of claim 54 or 55, wherein the at least one reference 3D map point comprises a third 3D map point and a fourth 3D map point, the third 3D map point being a 3D map point located at a parent node position of the current 3D map point in the topology tree, and the fourth 3D map point being a 3D map point located at a parent node position of the third 3D map point in the topology tree.
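For the topology-tree arrangement of claims 9-10, 20-21 and 54-56, reference selection reduces to walking up the tree: the parent of the current node supplies the third 3D map point, and the parent's parent, if used, the fourth. A small sketch follows, assuming a parent map keyed by node index; the representation and the function name references are illustrative assumptions, not mandated by the claims.

    from typing import Dict, List, Optional

    def references(parent: Dict[int, Optional[int]],
                   node: int, use_grandparent: bool = False) -> List[int]:
        refs = []
        p = parent.get(node)          # third 3D map point: the parent node
        if p is not None:
            refs.append(p)
            if use_grandparent:
                g = parent.get(p)     # fourth 3D map point: the parent's parent
                if g is not None:
                    refs.append(g)
        return refs

    # Example: a tiny tree 0 -> 1 -> 2. The references for node 2 are its
    # parent 1 and, when two references are used, also its grandparent 0.
    parent = {0: None, 1: 0, 2: 1}
    assert references(parent, 2) == [1]
    assert references(parent, 2, use_grandparent=True) == [1, 0]

Since parents are always encoded and decoded before their children in such a traversal, both sides can derive the same references without extra signaling, which parallels the sequence case above.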
CN202180098602.3A 2021-06-04 2021-06-04 Encoding and decoding method and device for 3D map Pending CN117397239A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/098485 WO2022252237A1 (en) 2021-06-04 2021-06-04 Methods and apparatus for encoding and decoding 3d map

Publications (1)

Publication Number Publication Date
CN117397239A (en) 2024-01-12

Family

ID=84323762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180098602.3A Pending CN117397239A (en) 2021-06-04 2021-06-04 Encoding and decoding method and device for 3D map

Country Status (2)

Country Link
CN (1) CN117397239A (en)
WO (1) WO2022252237A1 (en)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130044957A1 (en) * 2011-08-19 2013-02-21 Qualcomm Incorporated Methods and apparatuses for encoding and/or decoding mapped features in an electronic map of a structure
JP6638267B2 (en) * 2015-09-07 2020-01-29 カシオ計算機株式会社 Geographic coordinate encoding device, method, and program, geographic coordinate decoding device, method, and program, terminal device using geographic coordinate encoding device
US10499054B2 (en) * 2017-10-12 2019-12-03 Mitsubishi Electric Research Laboratories, Inc. System and method for inter-frame predictive compression for point clouds
CA3078455A1 (en) * 2017-10-24 2019-05-02 Panasonic Intellectual Property Corporation Of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CA3101091A1 (en) * 2018-06-06 2019-12-12 Panasonic Intellectual Property Corporation Of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN112235580A (en) * 2019-07-15 2021-01-15 华为技术有限公司 Image encoding method, decoding method, device and storage medium
CN110572655B (en) * 2019-09-30 2023-01-10 北京大学深圳研究生院 Method and equipment for encoding and decoding point cloud attribute based on neighbor weight parameter selection and transmission
CN110647609B (en) * 2019-09-17 2023-07-18 上海图趣信息科技有限公司 Visual map positioning method and system
CN111405281A (en) * 2020-03-30 2020-07-10 北京大学深圳研究生院 Point cloud attribute information encoding method, point cloud attribute information decoding method, storage medium and terminal equipment

Also Published As

Publication number Publication date
WO2022252237A1 (en) 2022-12-08

Similar Documents

Publication Publication Date Title
WO2021088498A1 (en) Virtual object display method and electronic device
WO2021088497A1 (en) Virtual object display method, global map update method, and device
WO2018054114A1 (en) Picture encoding method and terminal
WO2023051383A1 (en) Device positioning method, device and system
WO2022252237A1 (en) Methods and apparatus for encoding and decoding 3d map
WO2022252236A1 (en) Methods and apparatus for encoding and decoding 3d map
CN117413523A (en) Encoding and decoding method and device for 3D map
WO2022252234A1 (en) 3d map encoding apparatus and method
WO2022252238A1 (en) 3d map compression method and apparatus, and 3d map decompression method and apparatus
US20240104781A1 (en) 3D Map Compression Method and Apparatus, and 3D Map Decompression Method and Apparatus
WO2022252235A1 (en) Decoding apparatus and method for 3d map, and encoded code stream of 3d map
JP2024521375A (en) Method and apparatus for deriving a 3D map
CN116664812A (en) Visual positioning method, visual positioning system and electronic equipment
CN116594048A (en) Positioning system, positioning method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination