CN113280817A - Visual navigation based on landmarks - Google Patents

Visual navigation based on landmarks

Info

Publication number
CN113280817A
Authority
CN
China
Prior art keywords
landmark
information
degree
freedom
image
Prior art date
Legal status
Pending
Application number
CN202010652637.4A
Other languages
Chinese (zh)
Inventor
诸小熊
李军舰
姚迪狄
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010652637.4A
Publication of CN113280817A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a landmark-based visual navigation method, which comprises the following steps: determining a landmark in a visual scene; acquiring multi-degree-of-freedom information of the agent relative to the landmark; acquiring multi-degree-of-freedom change information of the agent relative to the landmark; and navigating the motion of the agent according to the multi-degree-of-freedom change information. Six-degree-of-freedom information of a landmark in the visual scene is constructed from camera image information and the attitude information of the device's gyroscope; the six degrees of freedom comprise three coordinates of the landmark in the visual scene (horizontal, vertical, and depth) and three angles at that coordinate point (pitch, rotation, and yaw). High-frame-rate visual navigation with the landmark as a reference point can then be realized from this six-degree-of-freedom information. The invention can be used to display virtual objects/characters in VR/AR, and also in scenarios such as autonomous driving and robot navigation; in combination with a gyroscope, it achieves highly real-time navigation of an agent on mobile devices with ordinary computing capability.

Description

Visual navigation based on landmarks
Technical Field
The invention relates to the technical field of map navigation, and in particular to a landmark-based visual navigation method and device.
Background
With the rapid development of computer vision technology, visual scene map construction and navigation based on computer vision are widely applied in scenarios such as VR/AR (virtual reality/augmented reality) and automatic navigation, owing to their low cost and broad applicability.
The most common visual map construction scheme is visual SLAM (Simultaneous Localization And Mapping), which builds map information through sensors, a visual odometer, and the like, and uses it to determine the current position of the agent. This scheme has several problems. First, SLAM map construction is complex: visual SLAM requires scene information captured from multiple angles as input, and then builds the map through feature extraction, matching, and similar techniques. Second, the computation is heavy and navigation is slow: because the map built by visual SLAM contains a large amount of information with rich features, map-based navigation is computationally expensive, and real-time navigation is difficult to achieve on ordinary computing devices, especially mobile devices.
Therefore, a visual navigation solution is needed that reduces the complexity of map construction, increases navigation speed, and can run on ordinary computing devices.
Disclosure of Invention
The invention aims to provide a landmark-based visual navigation method that realizes fast, simple construction of visual landmarks and highly real-time visual navigation.
In order to achieve the above object, an embodiment of the present invention provides a landmark based visual navigation method, including:
determining landmarks in a visual scene;
acquiring multi-degree-of-freedom information of the agent relative to the landmark;
acquiring multi-degree-of-freedom change information of the agent relative to the landmark;
and navigating the motion of the agent according to the multi-degree-of-freedom change information.
Further, the multi-degree-of-freedom information includes coordinate information and angle information.
Further, the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate and a depth coordinate of the agent relative to the landmark in the visual scene, and a pitch angle, a yaw angle and a rotation angle of the agent in a spatial coordinate system; acquiring the multi-degree-of-freedom information of the agent relative to the landmark specifically comprises:
acquiring the visual scene captured by the agent's camera, analyzing the visual scene, and determining the abscissa, the ordinate and the depth coordinate of the agent relative to the landmark;
acquiring sensor data of the agent, and determining the pitch angle, the yaw angle and the rotation angle of the agent in the spatial coordinate system.
Further, determining landmarks in the visual scene specifically comprises: using a region of the visual scene pre-selected by the user as the landmark.
Further, determining landmarks in the visual scene specifically comprises: identifying a salient object target in the visual scene as the landmark by using a subject identification algorithm, or detecting a specific area as the landmark by using a target detection algorithm.
Further, the method further comprises: after the multi-degree-of-freedom information is obtained, initializing an image tracking algorithm by using the multi-degree-of-freedom information, wherein the image tracking algorithm is used for obtaining the position and/or the area of the landmark in the current visual scene.
Further, the method further comprises: judging whether the current landmark is lost, and if so, stopping the motion navigation and starting a re-detection step.
Further, the re-detection step specifically comprises: detecting the landmark by taking the last frame before the loss as a template, and if the landmark is detected, re-acquiring the multi-degree-of-freedom information of the agent relative to the landmark.
Further, the area center coordinates of the landmark image are used as the abscissa and the ordinate of the landmark, from which the abscissa and the ordinate of the agent relative to the landmark are obtained, and the distance of the agent's camera from the landmark is used as the depth coordinate; the depth value is obtained as follows: the minimum circumscribed circle of the landmark image region is acquired, and the product of its radius R and a prior coefficient k is taken as the depth coordinate of the landmark, thereby obtaining the depth coordinate of the agent relative to the landmark.
Further, the multi-degree-of-freedom change information of the agent with respect to the landmark comprises: change information for the pitch, yaw and rotation angles, the displacement of the agent relative to the landmark in the landmark plane, and the depth displacement of the agent relative to the landmark; wherein the displacement in the landmark plane is the difference between the coordinates of the landmark in the current image frame and the initial coordinates of the landmark.
Further, the depth displacement is determined according to the minimum circumscribed circle radius of the current landmark image region and the minimum circumscribed circle radius of the landmark image region when the landmark is constructed.
The embodiment of the invention also provides a visual navigation device based on the landmark, which comprises:
a landmark determination module to determine landmarks in a visual scene;
the multi-degree-of-freedom information construction module is used for acquiring the multi-degree-of-freedom information of the agent relative to the landmark;
the change information acquisition module is used for acquiring the position change information of the agent relative to the landmark;
and the visual navigation module is used for navigating the motion of the agent according to the position change information.
Further, the multi-degree-of-freedom information includes coordinate information and angle information.
Further, the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate and a depth coordinate of the agent relative to the landmark in the visual scene, and a pitch angle, a yaw angle and a rotation angle of the agent in a spatial coordinate system; the multi-degree-of-freedom information construction module is specifically used for:
acquiring the visual scene captured by the agent's camera, analyzing the visual scene, and determining the abscissa, the ordinate and the depth coordinate of the agent relative to the landmark;
acquiring sensor data of the agent, and determining the pitch angle, the yaw angle and the rotation angle of the agent in the spatial coordinate system.
Further, the landmark determination module is specifically configured to: use a region of the visual scene pre-selected by the user as the landmark.
Further, the landmark determination module is specifically configured to: identify a salient object target in the visual scene as the landmark by using a subject identification algorithm, or detect a specific area as the landmark by using a target detection algorithm.
Further, the multi-degree-of-freedom information construction module is further configured to: after the multi-degree-of-freedom information is obtained, initialize an image tracking algorithm with the multi-degree-of-freedom information, wherein the image tracking algorithm is used to obtain the position and/or the area of the landmark in the current visual scene.
Further, the visual navigation module is further configured to determine whether the current landmark is lost, stop the motion navigation if the current landmark is lost, and start the re-detection module.
Further, the re-detection module is configured to detect the landmark by using the last frame before the loss is determined as a template, and if the landmark is detected, to re-acquire the multi-degree-of-freedom information of the agent relative to the landmark.
Further, the area center coordinates of the landmark image are used as the abscissa and the ordinate of the landmark, from which the abscissa and the ordinate of the agent relative to the landmark are obtained, and the distance of the agent's camera from the landmark is used as the depth coordinate; the depth value is obtained as follows: the minimum circumscribed circle of the landmark image region is acquired, and the product of its radius R and a prior coefficient k is taken as the depth coordinate of the landmark, thereby obtaining the depth coordinate of the agent relative to the landmark.
Further, the multi-degree-of-freedom change information of the agent with respect to the landmark comprises: change information for the pitch, yaw and rotation angles, the displacement of the agent relative to the landmark in the landmark plane, and the depth displacement of the agent relative to the landmark; wherein the displacement in the landmark plane is the difference between the coordinates of the landmark in the current image frame and the initial coordinates of the landmark.
Further, the depth displacement is determined according to the minimum circumscribed circle radius of the current landmark image region and the minimum circumscribed circle radius of the landmark image region when the landmark was constructed.
An embodiment of the invention further provides an image acquisition method, which comprises the following steps:
determining an acquisition object in a visual scene, wherein the acquisition object is at least one salient object or a specific area in the visual scene;
acquiring an image of the object;
acquiring multi-degree-of-freedom information of the agent relative to the acquisition object;
associating the image of the acquired object with the multi-degree-of-freedom information;
storing the image of the acquisition object and the associated multiple degree of freedom information.
Further, determining the acquisition object in the visual scene specifically comprises: identifying a salient object in the visual scene as the acquisition object by using an image subject identification algorithm, or detecting a specific area as the acquisition object by using a target detection algorithm.
Further, the multi-degree-of-freedom information includes coordinate information and angle information.
Further, the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate and a depth coordinate of the agent relative to the acquisition object in the visual scene, and a pitch angle, a yaw angle and a rotation angle of the agent in a spatial coordinate system; acquiring the multi-degree-of-freedom information of the agent relative to the acquisition object specifically comprises:
acquiring the visual scene captured by the agent's camera, analyzing the visual scene, and determining the abscissa, the ordinate and the depth coordinate of the agent relative to the acquisition object;
acquiring sensor data of the agent, and determining the pitch angle, the yaw angle and the rotation angle of the agent in the spatial coordinate system.
Further, the method further comprises:
acquiring environment attribute information at the time the agent acquires the image of the object;
associating the image of the acquisition object with the environment attribute information;
storing the associated environment attribute information.
Further, the method further comprises:
acquiring multi-degree-of-freedom information and/or environment attribute information of the current agent relative to a specified object;
acquiring an image of the specified object according to the multi-degree-of-freedom information and/or the environment attribute information;
presenting an image of the specified object.
An embodiment of the present invention further provides a computer program product comprising computer program instructions which, when executed by a processor, implement the aforementioned landmark-based visual navigation method or the aforementioned image acquisition method.
An embodiment of the present invention further provides a computer-readable storage medium having a computer program stored thereon which, when executed, implements the aforementioned landmark-based visual navigation method or the aforementioned image acquisition method.
The invention has the following beneficial effects. The invention provides a landmark-based visual navigation method, which comprises the following steps: acquiring multi-degree-of-freedom information of the agent relative to the landmark; acquiring multi-degree-of-freedom change information of the agent relative to the landmark; and navigating the motion of the agent according to the multi-degree-of-freedom change information. Six-degree-of-freedom information of a landmark in the visual scene is constructed from camera image information and the attitude information of the device's gyroscope; the six degrees of freedom comprise three coordinates of the landmark in the visual scene (horizontal, vertical, and depth) and three angles at that coordinate point (pitch, rotation, and yaw). High-frame-rate visual navigation with the landmark as a reference point can then be realized from this six-degree-of-freedom information. The invention can be used to display six-degree-of-freedom virtual objects/characters in VR/AR, and also in scenarios such as autonomous driving and robot navigation. In combination with gyroscope information, highly real-time navigation of an agent can be achieved on mobile devices with ordinary computing capability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only embodiments of the present invention, and that those skilled in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of a method according to a first embodiment of the present invention.
Fig. 2 is a schematic diagram of landmark regions in a visual scene.
Fig. 3 is a block diagram of an apparatus according to a second embodiment of the present invention.
Fig. 4 is a flowchart of a method according to a third embodiment of the present invention.
Detailed Description
To enable those skilled in the art to understand and implement the present invention, the technical solutions of the present invention are described below clearly and completely with reference to the accompanying drawings. It is obvious that the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Because the map image information constructed by visual SLAM has complex image features and the algorithm complexity of visual SLAM is high, real-time navigation is difficult to achieve on mobile devices, particularly those with ordinary computing capability (such as mobile phones).
This scheme instead uses an image tracking algorithm: only the landmark region needs to be tracked, and the displacement and attitude change of the agent relative to the landmark can be obtained from the image coordinate information and the gyroscope information. Most existing image tracking algorithms are highly real-time, so real-time image tracking can be achieved on mobile devices. Combined with the gyroscope information, highly real-time navigation of the agent can therefore be achieved on mobile devices with ordinary computing capability.
An agent here mainly refers to a mobile device equipped with a camera, a gyroscope and a computing unit, such as a smartphone or a camera-equipped drone.
Example one
Referring to Fig. 1, an embodiment of the present invention provides a landmark-based visual navigation method, which comprises a landmark determination step, a multi-degree-of-freedom information construction step, a change information acquisition step, and a visual navigation step.
A landmark determination step determines a landmark in the visual scene. The landmark is a marked region used as a position and attitude reference during motion navigation; its region in the visual scene can be pre-selected by the user. As shown in Fig. 2, the user takes a vertical cabinet in the visual scene as the landmark (a sketch of such user selection follows below). Landmarks may also be determined by intelligent algorithms, for example: the most salient object target in the visual scene is identified as the landmark using a subject recognition algorithm, or, in a particular scene, a specific area (e.g., a logo) is detected as the landmark using a target detection algorithm.
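By way of illustration only, the following is a minimal sketch of the user pre-selection variant, assuming an OpenCV-based implementation; the helper name select_landmark and the window title are illustrative and not taken from the patent.

```python
# Illustrative sketch: the user pre-selects the landmark region,
# assuming OpenCV. select_landmark is a hypothetical helper, not the
# patent's API.
import cv2

def select_landmark(frame):
    # Let the user drag a rectangle around the landmark region
    # (e.g. the vertical cabinet of Fig. 2).
    x, y, w, h = cv2.selectROI("select landmark", frame, showCrosshair=True)
    cv2.destroyWindow("select landmark")
    return (x, y, w, h)  # landmark region in image coordinates
```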
A multi-degree-of-freedom information construction step acquires the multi-degree-of-freedom information of the agent relative to the landmark. After the multi-degree-of-freedom information is obtained, an image tracking algorithm is initialized with it so as to realize region tracking at the visual image level.
The multi-degree-of-freedom information is six-degree-of-freedom information. The six degrees of freedom comprise the abscissa, the ordinate and the depth coordinate of the agent relative to the landmark in the visual scene, and the pitch angle, the yaw angle and the rotation angle of the agent in a spatial coordinate system. The visual scene is the scene in an image frame captured by the agent's camera. Attitude angle information such as the pitch angle, the yaw angle and the rotation angle can be obtained from a gyroscope in the agent.
As shown in Fig. 2, the area center coordinates (x, y) of the landmark image are taken as the abscissa and the ordinate of the landmark, from which the abscissa and the ordinate of the agent relative to the landmark are obtained. The distance of the agent's camera from the landmark is taken as the depth coordinate. The depth value is obtained as follows: the minimum circumscribed circle of the landmark image region is acquired, and the product of its radius R and a prior coefficient k is taken as the depth coordinate of the landmark, i.e. d = R × k, where k is an empirical value set according to the specific application and scene (an illustrative sketch follows).
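A minimal sketch of this six-degree-of-freedom construction, assuming OpenCV/NumPy: taking the circumscribed circle over the ROI corner points, the helper name build_landmark_6dof and the value of K_PRIOR are assumptions; the text specifies only that the region centre gives (x, y), that d = R × k, and that the angles come from the gyroscope.

```python
# Illustrative sketch of landmark 6-DoF initialization, assuming
# OpenCV/NumPy. K_PRIOR stands in for the empirical coefficient k.
import cv2
import numpy as np

K_PRIOR = 50.0  # assumed prior coefficient k, set per application/scene

def build_landmark_6dof(roi, gyro_angles):
    """roi = (x, y, w, h) landmark region; gyro_angles = (pitch, roll, yaw)."""
    x, y, w, h = roi
    # Region centre gives the landmark's abscissa/ordinate (x, y).
    cx, cy = x + w / 2.0, y + h / 2.0
    # Minimum circumscribed circle of the landmark region, taken here
    # over the ROI corner points for simplicity.
    pts = np.array([[x, y], [x + w, y], [x, y + h], [x + w, y + h]],
                   dtype=np.float32)
    _, radius = cv2.minEnclosingCircle(pts)
    depth = radius * K_PRIOR           # d = R * k, as in the text
    pitch, roll, yaw = gyro_angles     # attitude from the agent's gyroscope
    return {"x": cx, "y": cy, "d": depth,
            "pitch": pitch, "roll": roll, "yaw": yaw, "R0": radius}
```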
A change information acquisition step acquires the multi-degree-of-freedom change information of the agent relative to the landmark. The multi-degree-of-freedom change information comprises: the change information of the three attitude angles (delta_P, delta_R, delta_Y), the displacement of the agent in the landmark plane (delta_x, delta_y), and the depth displacement delta_d of the agent relative to the landmark.
The change information of the three attitude angles is the difference between the current attitude of the agent and the attitude recorded when the landmark was constructed. Taking the pitch angle as an example: if the current gyroscope pitch reading is P1 and the pitch angle when the landmark was constructed is P0, the pitch change is delta_P = P1 - P0. The change information of the three attitude angles (delta_P, delta_R, delta_Y) is obtained in the same way.
For the position change, the position and the area of the current landmark in the image are obtained through the image tracker. The displacement in the landmark plane is the difference between the coordinates of the landmark in the current image frame and the initial coordinates recorded when the landmark was constructed. Taking the horizontal axis x as an example: if the horizontal coordinate of the landmark in the current image is x1 and its initial position is x0, then delta_x = x1 - x0. The displacements in the image plane (delta_x, delta_y) are obtained in the same way.
For the depth displacement, let the minimum circumscribed circle radius of the current landmark image region be R1 and the minimum circumscribed circle radius of the landmark image region when the landmark was constructed be R0; then delta_d = k × (R1/R0). A sketch of this change-information computation follows.
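The sketch below mirrors the formulas above literally (delta_P = P1 - P0, delta_x = x1 - x0, delta_d = k * (R1/R0)); the function signature and the landmark record layout from the earlier sketch are assumptions.

```python
# Illustrative sketch of the change-information step. `landmark` is the
# record produced at construction time (an assumed layout), gyro_now is
# the current gyroscope reading, and track_xy / track_radius come from
# the image tracker for the current frame. k is the same prior
# coefficient used when the landmark was constructed.
def compute_deltas(landmark, gyro_now, track_xy, track_radius, k):
    P1, Rot1, Y1 = gyro_now
    # Attitude change: current reading minus the reading recorded when
    # the landmark was constructed, e.g. delta_P = P1 - P0.
    delta_P = P1 - landmark["pitch"]
    delta_R = Rot1 - landmark["roll"]
    delta_Y = Y1 - landmark["yaw"]
    # In-plane displacement: delta_x = x1 - x0, delta_y = y1 - y0.
    delta_x = track_xy[0] - landmark["x"]
    delta_y = track_xy[1] - landmark["y"]
    # Depth displacement from the circumscribed-circle radii, as stated
    # in the text: delta_d = k * (R1 / R0).
    delta_d = k * (track_radius / landmark["R0"])
    return delta_P, delta_R, delta_Y, delta_x, delta_y, delta_d
```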
A visual navigation step navigates the motion of the agent according to the multi-degree-of-freedom change information.
Preferably, the visual navigation step further comprises: judging whether the current landmark is lost by using the image tracking algorithm, and if so, stopping the motion navigation and starting a re-detection step. Taking the KCF (Kernelized Correlation Filter) tracking algorithm as an example, the current tracking state can be judged from the filter response value of each frame.
Preferably, the re-detection step specifically comprises: detecting the landmark by taking the image of the last frame before the loss was determined as a template, and if the landmark is detected, re-acquiring the six-degree-of-freedom information of the landmark.
The image tracking algorithm can be any algorithm that realizes object tracking through images, and is not limited to the KCF visual target tracking algorithm. A sketch of loss detection and re-detection follows.
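An illustrative sketch of loss detection and re-detection, assuming an opencv-contrib build with the KCF tracker. Note that OpenCV's Python binding exposes only a per-frame success flag rather than the raw filter response the text refers to, so the flag is used here as a stand-in loss criterion; the matching threshold of 0.7 is likewise an assumption.

```python
# Illustrative sketch: KCF tracking with template-matching re-detection,
# assuming opencv-contrib. The success flag substitutes for the filter
# response value described in the text.
import cv2

tracker = cv2.TrackerKCF_create()
# tracker.init(first_frame, (x, y, w, h)) is assumed to have been called
# with the landmark region from the construction step.

def track_or_redetect(frame, last_template):
    ok, bbox = tracker.update(frame)
    if ok:
        return bbox
    # Landmark lost: stop navigation and re-detect, using the landmark
    # patch from the last frame before the loss as the template.
    res = cv2.matchTemplate(frame, last_template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(res)
    if max_val > 0.7:  # assumed matching threshold
        h, w = last_template.shape[:2]
        # Re-initialize tracking and re-acquire the 6-DoF info here.
        return (max_loc[0], max_loc[1], w, h)
    return None  # still lost; navigation stays stopped
```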
Example two
Referring to Fig. 3, a second embodiment of the present invention provides a landmark-based visual navigation device 300, which comprises a landmark determination module 301, a multi-degree-of-freedom information construction module 302, a change information acquisition module 303, and a visual navigation module 304.
The landmark determination module 301 determines a landmark in the visual scene. The landmark is a marked region used as a position and attitude reference during motion navigation, and its region in the visual scene can be pre-selected by the user. Landmarks may also be determined by intelligent algorithms, for example: the most salient object target in the visual scene is identified as the landmark using a subject recognition algorithm, or, in a particular scene, a specific area (e.g., a logo) is detected as the landmark using a target detection algorithm.
The multi-degree-of-freedom information construction module 302 acquires the multi-degree-of-freedom information of the agent relative to the landmark. After the multi-degree-of-freedom information is obtained, an image tracking algorithm is initialized with it so as to realize region tracking at the visual image level. The multi-degree-of-freedom information is six-degree-of-freedom information. The six degrees of freedom comprise the abscissa, the ordinate and the depth coordinate of the agent relative to the landmark in the visual scene, and the pitch angle, the yaw angle and the rotation angle of the agent in a spatial coordinate system; the visual scene is the scene in an image frame captured by the agent's camera.
The change information acquisition module 303 is configured to acquire the multi-degree-of-freedom change information of the agent relative to the landmark.
The visual navigation module 304 navigates the motion of the agent according to the six-degree-of-freedom change information of the agent relative to the landmark.
Preferably, the apparatus further comprises a re-detection module 305. The visual navigation module 304 is further configured to determine whether the current landmark is lost through an image tracking algorithm, and if the current landmark is lost, stop the motion navigation and start the re-detection module 305.
The re-detection module 305 is configured to detect the landmark by using the image of the last frame before the loss was determined as a template, and if the landmark is detected, to re-acquire the multi-degree-of-freedom information of the landmark.
Example three
Referring to Fig. 4, a third embodiment of the present invention provides an image acquisition method, comprising:
s401, an acquisition object in the visual scene is determined, wherein the acquisition object is at least one salient object or a specific area in the visual scene.
In addition to capturing specified objects, the present invention can also capture all objects in a visual scene. The types of objects contained differ between scenes: in a show-home (model room) scene, for example, the objects include furniture and decorations; in a museum scene, the objects include exhibits.
The acquisition object is determined by intelligent algorithms, for example: a salient object in the visual scene is identified as the acquisition object using an image subject recognition algorithm, or, in a particular scene, a specific area (such as a logo, a piece of furniture or a decoration) is detected as the acquisition object using a target detection algorithm. As shown in Fig. 2, a vertical cabinet in the visual scene is taken as the acquisition object.
S402, acquiring an image of the object. The collected images can help the user browse the scene space, such as a home decoration scene or a museum scene; besides browsing particular objects from multiple angles, the user can browse the whole image of the visual scene and/or images of the other objects.
S403, acquiring the multi-degree-of-freedom information of the agent relative to the acquisition object. After the multi-degree-of-freedom information is obtained, an image tracking algorithm is initialized with it so as to realize region tracking at the visual image level.
The multi-degree-of-freedom information is six-degree-of-freedom information. The six degrees of freedom comprise the abscissa, the ordinate and the depth coordinate of the agent relative to the acquisition object, and the pitch angle, the yaw angle and the rotation angle of the agent in a spatial coordinate system. The visual scene is the scene in an image frame captured by the agent's camera. Attitude angle information such as the pitch angle, the yaw angle and the rotation angle can be obtained from a gyroscope in the agent.
As shown in Fig. 2, the area center coordinates (x, y) of the acquisition object's image region are taken as its abscissa and ordinate, from which the abscissa and the ordinate of the agent relative to the acquisition object are obtained. The distance of the acquisition object from the agent's camera is taken as the depth coordinate. The depth coordinate is obtained as follows: the minimum circumscribed circle of the image region is acquired, and the product of its radius R and a prior coefficient k is taken as the depth coordinate, i.e. d = R × k, where k is an empirical value set according to the specific application and scene.
S404, associating the image of the acquired object with the multi-degree-of-freedom information. A mapping relation between the image of the acquired object and the multi-degree-of-freedom information is thus established.
Preferably, environment attribute information at the time the agent captures the image is also acquired, and the object in the visual scene is associated with the environment attribute information. The environment information includes the shooting time, the scene type, the season of shooting, and the like.
S405, storing the image of the acquired object and the associated multi-degree-of-freedom information; preferably, the associated environment attribute information is also stored. A sketch of such an association record follows.
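By way of illustration, a minimal sketch of the association record of steps S404 and S405; the record layout and the JSON-lines storage are assumptions, since the text requires only that the image, the six-degree-of-freedom information and the environment attributes be associated and stored.

```python
# Illustrative sketch, assuming a JSON-lines index file; the record
# layout is an assumption, not the patent's storage format.
import json
from dataclasses import dataclass, field, asdict

@dataclass
class CaptureRecord:
    image_path: str              # where the captured image is stored
    x: float                     # abscissa relative to the object
    y: float                     # ordinate relative to the object
    d: float                     # depth coordinate (d = R * k)
    pitch: float                 # gyroscope attitude angles
    yaw: float
    roll: float
    env: dict = field(default_factory=dict)  # e.g. time, scene type, season

def store_record(record, index_path="captures.json"):
    # S404/S405: persist the image-to-information association.
    with open(index_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```

Retrieval for S406-style presentation could then filter these records by nearest six-degree-of-freedom pose and/or matching environment attributes.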
Through the above steps, images of objects in the visual scene at different angles and different distances relative to the agent are established. The method can also be used to present six-degree-of-freedom virtual objects/characters in VR/AR.
Preferably, the method further comprises: acquiring multi-degree-of-freedom information and/or environment attribute information of the current agent relative to a specified object; acquiring an image of the specified object according to the multi-degree-of-freedom information and/or the environment attribute information; and presenting the image of the specified object.
The method of the third embodiment of the invention can collect images of objects in the visual scene during visual navigation and establish an association between the agent's trajectory information and the collected images. Based on the trajectory information, the viewing position can be determined accurately, the viewing angle for a given specified object can be retrieved, and so on.
Taking image acquisition in a show home as an example: after acquisition is completed with the method of the third embodiment, a full 3D view of the show home and images of particular pieces of furniture/decorations from different viewing angles can be generated from the collected images. Other users (such as customers visiting the show home) can view images of a specified object from different viewing angles, and can also view the overall 3D effect of the show home as a reference for buying or decorating a house.
It is clear to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the above-described apparatuses, modules and units, reference may be made to the corresponding processes of the foregoing method embodiments, which are not repeated here.
An embodiment of the invention also discloses a computer program product comprising computer program instructions which, when executed by a processor, implement the method of the first or the third embodiment.
An embodiment of the invention also discloses a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the method of the first or the third embodiment is implemented.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart and block diagrams may represent a module, segment, or portion of code, which comprises one or more computer-executable instructions for implementing the logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. It will also be noted that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention, and is provided by way of illustration only and not limitation. It will be apparent to those skilled in the art from this disclosure that various other changes and modifications can be made without departing from the spirit and scope of the invention.

Claims (20)

1. A landmark-based visual navigation method, the method comprising:
determining landmarks in a visual scene;
acquiring multi-degree-of-freedom information of the agent relative to the landmark;
acquiring multi-degree-of-freedom change information of the agent relative to the landmark;
and navigating the motion of the agent according to the multi-degree-of-freedom change information.
2. The method of claim 1, wherein the multi-degree-of-freedom information comprises coordinate information and angle information.
3. The method of claim 2, wherein the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate and a depth coordinate of the agent relative to the landmark in the visual scene, and a pitch angle, a yaw angle and a rotation angle of the agent in a spatial coordinate system; and wherein acquiring the multi-degree-of-freedom information of the agent relative to the landmark specifically comprises:
acquiring the visual scene captured by the agent's camera, analyzing the visual scene, and determining the abscissa, the ordinate and the depth coordinate of the agent relative to the landmark;
acquiring sensor data of the agent, and determining the pitch angle, the yaw angle and the rotation angle of the agent in the spatial coordinate system.
4. The method of claim 1, wherein determining landmarks in a visual scene specifically comprises: using a region of the visual scene pre-selected by the user as the landmark.
5. The method of claim 1, wherein determining landmarks in a visual scene specifically comprises: identifying a salient object target in the visual scene as the landmark by using a subject identification algorithm, or detecting a specific area as the landmark by using a target detection algorithm.
6. The method of claim 1, wherein the method further comprises: after the multi-degree-of-freedom information is obtained, initializing an image tracking algorithm by using the multi-degree-of-freedom information, wherein the image tracking algorithm is used for obtaining the position and/or the area of the landmark in the current visual scene.
7. The method of claim 1, wherein the method further comprises: judging whether the current landmark is lost, and if so, stopping the motion navigation and starting a re-detection step.
8. The method according to claim 7, wherein the re-detection step specifically comprises: detecting the landmark by taking the last frame before the loss as a template, and if the landmark is detected, re-acquiring the multi-degree-of-freedom information of the agent relative to the landmark.
9. The method of claim 3, wherein the coordinates of the center of the area of the landmark image are used as the abscissa and the ordinate of the landmark, from which the abscissa and the ordinate of the agent relative to the landmark are obtained, and the distance of the agent's camera from the landmark is used as the depth coordinate; the depth coordinate is obtained as follows: the minimum circumscribed circle of the landmark image region is acquired, and the product of its radius R and a prior coefficient k is taken as the depth coordinate of the landmark, thereby obtaining the depth coordinate of the agent relative to the landmark.
10. The method of claim 3, wherein the multi-degree-of-freedom change information of the agent with respect to the landmark comprises: change information for the pitch, yaw and rotation angles, the displacement of the agent relative to the landmark in the landmark plane, and the depth displacement of the agent relative to the landmark; wherein the displacement in the landmark plane is the difference between the coordinates of the landmark in the current image frame and the initial coordinates of the landmark.
11. The method of claim 10, wherein the depth displacement is determined based on a minimum circumscribed circle radius of the current landmark image region and a minimum circumscribed circle radius of the landmark image region when constructing the landmark.
12. A landmark based visual navigation device, comprising:
a landmark determination module to determine landmarks in a visual scene;
the multi-degree-of-freedom information construction module is used for acquiring multi-degree-of-freedom information of the agent relative to the landmark;
the change information acquisition module is used for acquiring the position change information of the agent relative to the landmark;
and the visual navigation module is used for navigating the motion of the agent according to the position change information.
13. A method of image acquisition, the method comprising:
determining an acquisition object in a visual scene, wherein the acquisition object is at least one salient object or a specific area in the visual scene;
acquiring an image of the object;
acquiring multi-degree-of-freedom information of the agent relative to the acquisition object;
associating the image of the acquired object with the multi-degree-of-freedom information;
storing the image of the acquisition object and the associated multiple degree of freedom information.
14. The method according to claim 13, wherein determining acquisition objects in the visual scene specifically comprises: identifying a salient object in the visual scene as the acquisition object by using an image subject identification algorithm, or detecting a specific area as the acquisition object by using a target detection algorithm.
15. The method of claim 13, wherein the multi-degree-of-freedom information includes coordinate information and angle information.
16. The method of claim 15, wherein the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate and a depth coordinate of the agent relative to the acquisition object in the visual scene, and a pitch angle, a yaw angle and a rotation angle of the agent in a spatial coordinate system; and wherein acquiring the multi-degree-of-freedom information of the agent relative to the acquisition object specifically comprises:
acquiring the visual scene captured by the agent's camera, analyzing the visual scene, and determining the abscissa, the ordinate and the depth coordinate of the agent relative to the acquisition object;
acquiring sensor data of the agent, and determining the pitch angle, the yaw angle and the rotation angle of the agent in the spatial coordinate system.
17. The method of claim 13, wherein the method further comprises:
acquiring environment attribute information at the time the agent acquires the image of the object;
associating the image of the object with the environmental attribute information;
storing the associated environment attribute information.
18. The method of claim 13, wherein the method further comprises:
acquiring multi-degree-of-freedom information and/or environment attribute information of the current agent relative to a specified object;
acquiring an image of the specified object according to the multi-degree-of-freedom information and/or the environment attribute information;
presenting an image of the specified object.
19. A computer program product comprising computer program instructions for implementing the visual navigation method of any one of claims 1-11 or the image acquisition method of any one of claims 13-18 when said instructions are executed by a processor.
20. A computer-readable storage medium having stored thereon a computer program which, when executed, implements the visual navigation method of any one of claims 1-11 or the image acquisition method of any one of claims 13-18.
CN202010652637.4A 2020-07-08 2020-07-08 Visual navigation based on landmarks Pending CN113280817A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010652637.4A CN113280817A (en) 2020-07-08 2020-07-08 Visual navigation based on landmarks


Publications (1)

Publication Number Publication Date
CN113280817A 2021-08-20

Family

ID=77275622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010652637.4A Pending CN113280817A (en) 2020-07-08 2020-07-08 Visual navigation based on landmarks

Country Status (1)

Country Link
CN (1) CN113280817A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1024347A1 (en) * 1999-01-28 2000-08-02 International Business Machines Corporation Method and device for navigation
JP2004030445A (en) * 2002-06-27 2004-01-29 National Institute Of Advanced Industrial & Technology Method, system, and program for estimating self-position of moving robot
WO2006109527A1 (en) * 2005-03-30 2006-10-19 National University Corporation Kumamoto University Navigation device and navigation method
CN105241445A (en) * 2015-10-20 2016-01-13 深圳大学 Method and system for acquiring indoor navigation data based on intelligent mobile terminal
CN105910615A (en) * 2016-03-30 2016-08-31 宁波元鼎电子科技有限公司 Navigation method and system for walking based on virtual reality
US20170116783A1 (en) * 2015-10-26 2017-04-27 Institute Of Nuclear Energy Research Atomic Energy Council, Executive Yuan Navigation System Applying Augmented Reality
CN111197984A (en) * 2020-01-15 2020-05-26 重庆邮电大学 Vision-inertial motion estimation method based on environmental constraint


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination