CN114594768B - Mobile robot navigation decision-making method based on visual feature map reconstruction - Google Patents
- Publication number
- CN114594768B (application CN202210207094.4A)
- Authority
- CN
- China
- Prior art keywords
- navigation
- mobile robot
- visual
- feature map
- decision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0231—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
- G05D1/0246—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
- G05D1/0253—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting relative motion information from a plurality of images taken successively, e.g. visual odometry, optical flow
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0214—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory in accordance with safety or protection criteria, e.g. avoiding hazardous areas
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0221—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0276—Control of position or course in two dimensions specially adapted to land vehicles using signals provided by a source external to the vehicle
Abstract
The invention discloses a mobile robot navigation decision-making method based on visual feature map reconstruction, whose specific process is as follows: at any time during navigation, four images in different directions, acquired by four monocular vision sensors mounted on the mobile robot, are combined into a visual panorama centered on the robot using the OpenPano algorithm; based on a correlation measure over visual patches, the visual features related to the mobile robot's navigation target are extracted from the panorama, and a navigation-target-guided visual image feature map is reconstructed; a mobile robot navigation decision model is then built on the reconstructed feature map and issues the motion commands required for the mobile robot to navigate to the target, realizing navigation-target-driven intelligent navigation. The invention improves the generalization of mobile robot navigation to complex and changeable environments and improves navigation efficiency.
Description
Technical Field
The invention relates to the technical field of mobile robot navigation, in particular to a mobile robot navigation decision method based on visual feature map reconstruction.
Background
With the rapid development of sensors and artificial intelligence, mobile robots are developing toward practicality, series production and intelligence. The basis for realizing the intelligence and autonomy of a mobile robot is safe and accurate navigation, and intelligent navigation technology has important practical value in the robot's various intelligent activities; it is therefore one of the cores of mobile robot research. Research on intelligent navigation gives the mobile robot capabilities such as effective exploration of unknown environments, efficient computation, autonomous decision-making and quick response, with broad application prospects in industrial manufacturing, ocean exploration, household service, medical care, resource development, inspection and hazard removal, aerospace, national defense and other fields.
A traditional mobile robot navigation method mainly comprises two stages: first, a map is built online or offline using SLAM with a depth sensor, a stereo camera, or a monocular camera with structure from motion; then a collision-free trajectory to the target point is computed on the constructed map. In a complex and variable environment, however, occlusion and data noise (for example, position changes of walking people, other mobile robots and various devices) mean that the map built in the first stage cannot provide reliable information; moreover, frequently rebuilding the scene map requires substantial manpower and time, which limits the popularization of mobile robots in complex and variable indoor scenes. Recently, the success of data-driven machine learning strategies on various control and perception problems has opened a new way to overcome the limitations of traditional methods. However, learning-based intelligent navigation mostly relies on a large amount of navigation experience in similar environments, with low data efficiency and poor generalization; most test data come from simulated synthetic scenes, so the navigation effect in complex real scenes is poor and cannot adapt to position changes of objects in a real environment, again limiting the popularization and use of mobile robots.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a mobile robot navigation decision method based on visual feature map reconstruction.
In order to achieve the technical purpose, the invention adopts the following technical scheme: a mobile robot navigation decision-making method based on visual feature map reconstruction comprises the following steps:
(1) at any time t of the mobile robot navigation, generating a visual panoramic image with the mobile robot as the center based on an OpenPano algorithm for four images in different directions acquired by four monocular visual sensors carried on the mobile robot;
(2) based on the relevance measurement of the visual patches, extracting visual features related to a navigation target of the mobile robot in the visual panorama in the step (1), and reconstructing a mobile robot visual image feature map guided by the navigation target;
(3) constructing a mobile robot navigation decision model based on deep reinforcement learning from the reconstructed mobile robot visual image feature map, and issuing the motion commands required for the mobile robot to navigate to the target.
Further, the four monocular vision sensors mounted on the mobile robot in the step (1) are arranged at the same height, and the shooting angle between any two adjacent monocular vision sensors is 90 degrees.
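As a rough illustration of the stitching in step (1): OpenPano itself performs feature-based stitching of overlapping views, but with the ideal assumption of four non-overlapping 90-degree fields of view, a panorama reduces to side-by-side concatenation. The sketch below makes that simplifying assumption and is not the OpenPano algorithm:

```python
import numpy as np

def naive_panorama(views):
    """Concatenate four 90-degree views (e.g. front, right, back, left) into
    a 360-degree panorama. Stand-in for OpenPano: assumes ideal,
    non-overlapping 90-degree fields of view, so concatenation suffices."""
    assert len(views) == 4, "expects exactly four directional images"
    h = views[0].shape[0]
    assert all(v.shape[0] == h for v in views), "views must share a height"
    return np.concatenate(views, axis=1)  # stack side by side along width
```

Four H x W x 3 images thus yield an H x 4W x 3 panorama centered on the robot.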
Further, the step (2) comprises the following sub-steps:
(2.1) designing a feature map extraction framework of the current visual panorama based on the VGG19 network;
(2.2) extracting a feature map of the navigation target image and a feature map of the robot's current visual image through the feature map extraction framework, and measuring the correlation between patches on the robot's current visual feature map and patches on the navigation target image feature map;
(2.3) constructing a navigation-target-guided mobile robot visual image feature map patch based on the correlation measurement, and reconstructing the mobile robot visual image feature map.
Further, the structure of the VGG19 network in step (2.1) is specifically: 16 convolutional layers, with channel counts of 64, 128, 256, 512 and 512 (one count per convolutional block); each convolutional layer has a 3 × 3 kernel with stride 1.
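A 3 × 3, stride-1 convolution preserves the spatial size when zero padding of 1 is used, which is what keeps adjacent layers' tensors at the same width and height. A minimal numpy sketch of such a "same" convolution (illustrative only, not the actual VGG19 implementation):

```python
import numpy as np

def conv3x3_same(x, w):
    """3x3 convolution, stride 1, zero padding 1: output spatial size equals
    input spatial size, as in the VGG19 extractor described above.
    x: (H, W, C_in) input tensor, w: (3, 3, C_in, C_out) kernel."""
    H, W, _ = x.shape
    Cout = w.shape[3]
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero-pad height and width by 1
    out = np.zeros((H, W, Cout))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + 3, j:j + 3, :]   # 3x3 receptive field
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out
```

With a kernel that is 1 at its center and 0 elsewhere, the output reproduces the input channel, confirming the size-preserving behavior.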
Further, the specific process of step (2.2) is as follows: the feature map extraction framework extracts the navigation target image feature map Ψ(I_g) from the navigation target image I_g, and the robot current visual image feature map Ψ(I_t) from the robot's visual panorama I_t at the current moment; patches Ψ_j(I_g), j = 1, …, n_g, are taken on Ψ(I_g), and patches Ψ_i(I_t), i = 1, …, n_t, are taken on Ψ(I_t); each feature patch Ψ_i(I_t) on Ψ(I_t) is matched, by maximizing the correlation between feature patches, to the most correlated feature patch Ψ*(I_g) on Ψ(I_g):

Ψ*(I_g) := argmax_{j = 1, …, n_g} ⟨Ψ_i(I_t), Ψ_j(I_g)⟩ / (‖Ψ_i(I_t)‖ · ‖Ψ_j(I_g)‖)

wherein := denotes a definition symbol; n_g is the number of patches on the navigation target image feature map Ψ(I_g) and j indexes its patches; n_t is the number of patches on the robot current visual image feature map Ψ(I_t) and i indexes its patches; ⟨Ψ_i(I_t), Ψ_j(I_g)⟩ denotes the inner product of the i-th patch of Ψ(I_t) with the j-th patch of Ψ(I_g); ‖Ψ_i(I_t)‖ and ‖Ψ_j(I_g)‖ denote the norms of Ψ_i(I_t) and Ψ_j(I_g).
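The matching step can be sketched as a cosine-similarity argmax over target patches; the helper below is illustrative and assumes each feature patch has been flattened to a vector:

```python
import numpy as np

def best_target_patch(psi_i, target_patches):
    """Return the index of the target-image feature patch most correlated
    with the current-view patch psi_i, using the normalized inner product
    <a, b> / (||a|| ||b||) maximized in the correlation metric above."""
    sims = [np.dot(psi_i, p) / (np.linalg.norm(psi_i) * np.linalg.norm(p))
            for p in target_patches]
    return int(np.argmax(sims))
```

Cosine similarity rather than the raw inner product is used so that patches with large activations do not dominate the match purely by magnitude.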
Further, the navigation-target-guided mobile robot visual image feature map patch Ψ̂_i(I_t, I_g) in step (2.3) is constructed as:

Ψ̂_i(I_t, I_g) := (⟨Ψ_i(I_t), Ψ*(I_g)⟩ / (‖Ψ_i(I_t)‖ · ‖Ψ*(I_g)‖)) · Ψ_i(I_t)

wherein Ψ_i(I_t) denotes the i-th patch of the robot current visual image feature map Ψ(I_t); Ψ*(I_g) denotes the feature patch of the navigation target image feature map Ψ(I_g) most correlated with Ψ_i(I_t); ⟨Ψ_i(I_t), Ψ*(I_g)⟩ denotes their inner product; ‖Ψ_i(I_t)‖ and ‖Ψ*(I_g)‖ denote their norms. Each patch is thus reweighted by its cosine similarity to its best-matching target patch.
Further, the step (3) comprises the following sub-steps:
(3.1) determining an output decision action space of the navigation decision model and a reward function in the process of training the decision model;
(3.2) taking the reconstructed mobile robot visual image feature map as the front end of the navigation decision model, and establishing the navigation decision model based on the deep reinforcement learning framework A3C;
(3.3) training the navigation decision model based on the deep reinforcement learning framework A3C until the reward function obtained by the mobile robot in each round in the training process is not increased any more, stopping training to obtain the trained navigation decision model, and issuing a motion command required by the mobile robot to navigate to a target on the trained navigation decision model to realize intelligent navigation of the mobile robot.
Further, the step (3.1) is realized as follows:
(a) establishing a robot body coordinate system with the center of a mobile robot chassis as an origin, wherein the direction vertical to the chassis is a Z-axis forward direction, the direction right in front of the mobile robot is an X-axis forward direction, and the Y-axis forward direction is determined based on a right-hand rule;
(b) determining an output decision action space A of the navigation decision model as { forward translation, backward translation, leftward translation, rightward translation, leftward rotation, rightward rotation and stopping }, and specifying that the distance of each translation motion in the decision action space A is 0.5 m, and the angle of each rotation is 30 degrees;
(c) determining the reward function in training the decision model as r_t = r_nav + r_g + r_c, wherein r_nav = Geo(I_{t-1}, I_g) - Geo(I_t, I_g) rewards effective navigation of the mobile robot; Geo(I_t, I_g) denotes the geodesic distance from the mobile robot at the current moment to the position of the target image, and Geo(I_{t-1}, I_g) the geodesic distance at the previous moment; r_g rewards completion of the navigation task: r_g = 10.0 when the mobile robot completes the navigation task, otherwise r_g = 0; r_c penalizes collisions during navigation: r_c = -2.0 when the mobile robot collides during navigation, otherwise r_c = 0.
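The reward in sub-step (c) can be written out directly; the function and argument names below are illustrative, while the constants 10.0 and -2.0 follow the patent text:

```python
def navigation_reward(geo_prev, geo_curr, reached_goal, collided):
    """Reward r_t = r_nav + r_g + r_c from step (3.1)(c).
    geo_prev / geo_curr: geodesic distances to the target-image position at
    the previous and current step; reached_goal / collided: episode flags."""
    r_nav = geo_prev - geo_curr          # positive when moving closer
    r_g = 10.0 if reached_goal else 0.0  # task-completion bonus
    r_c = -2.0 if collided else 0.0      # collision penalty
    return r_nav + r_g + r_c
```

A step that closes 0.5 m of geodesic distance earns +0.5; a collision outweighs several such steps, which biases the policy toward safe motion.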
Further, the navigation decision model based on the deep reinforcement learning framework A3C consists of 5 convolutional layers, 3 first fully-connected layers and 1 second fully-connected layer connected in sequence; the channel counts of the convolutional layers are 512, 256, 128 and 128, and each convolutional layer has a 3 × 3 kernel with stride 1; the sizes of the first fully-connected layers are 512, 128 and 64; the second fully-connected layer has two branches, one outputting a 7-dimensional decision command over the decision action space and the other outputting a 1-dimensional decision evaluation value.
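The two-branch second fully-connected layer can be sketched as follows; the softmax on the policy branch and the weight shapes are assumptions for illustration (the patent only states the output dimensions):

```python
import numpy as np

def a3c_head(features, w_pi, w_v):
    """Sketch of the second fully-connected layer's two branches: one emits
    a 7-dimensional policy over the decision action space, the other a
    1-dimensional value estimate. features: (d,), w_pi: (d, 7), w_v: (d,);
    illustrative weights, not the patent's trained parameters."""
    logits = features @ w_pi               # 7-dim decision-command branch
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                 # softmax over the 7 actions
    value = float(features @ w_v)          # 1-dim decision-evaluation branch
    return policy, value
```

In the A3C setting the policy branch is the actor and the scalar branch the critic, trained jointly from the shared 64-dimensional feature vector.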
Further, the step (3.3) is realized as follows: randomly initialize the parameters of the navigation decision model based on the deep reinforcement learning framework A3C, and update them with stochastic gradient descent (SGD) on the A3C loss function, with a learning rate of 1e-4 and momentum of 0.9; randomly assign 6 navigation tasks to 6 threads executed over the same time period, and compute the loss function and gradients and update the navigation decision model parameters only after each batch of 10 navigation decisions has been executed on the 6 navigation tasks; iterate this process until the reward obtained by the mobile robot in each round no longer increases, completing the online training process and yielding the trained navigation decision model, on which the motion commands required for the mobile robot to navigate to the target are issued, realizing intelligent navigation of the mobile robot.
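A3C-style updates such as the 10-decision batches described above typically compute discounted n-step returns bootstrapped from the value of the last state before forming the loss; the discount factor below is an assumed value, as the patent does not specify one:

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted n-step returns for an A3C-style update: walk backward from
    a bootstrapped value estimate of the state after the last decision.
    gamma is an assumed discount factor, not given in the patent."""
    returns = []
    R = bootstrap_value
    for r in reversed(rewards):
        R = r + gamma * R       # R_t = r_t + gamma * R_{t+1}
        returns.append(R)
    return list(reversed(returns))
```

Each thread would feed its 10 rewards and the critic's value of the final state into this routine, then use the returns as targets for the value branch and advantages for the policy branch.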
Compared with the prior art, the invention has the following beneficial effects: based on a deep reinforcement learning navigation decision model, the method enables the mobile robot to navigate autonomously to the position of the target image in a complex and changeable environment without an environment map, reducing manual intervention compared with traditional navigation and improving navigation efficiency; in addition, based on the visual panorama acquired by the mobile robot and combined with the navigation target image, the visual image feature map is reconstructed, the visual features closely related to the navigation target are enhanced, the final navigation decision is guided, and the adaptability of mobile robot navigation to complex and changeable environments is improved. The method thus enables the mobile robot to navigate to the vicinity of the target image without building an environment map, improving both adaptability to complex and variable environments and navigation efficiency.
Drawings
FIG. 1 is a flow chart of a navigation decision method based on visual feature map reconstruction in accordance with the present invention;
fig. 2 shows the four visual images acquired by the four monocular vision sensors mounted on the mobile robot and the generated visual panorama: a in fig. 2 shows the four visual images, and b in fig. 2 shows the generated visual panorama;
FIG. 3 is a block diagram of a navigation decision model according to the present invention.
Detailed Description
The technical solution of the present invention is further explained below with reference to the accompanying drawings.
Compared with traditional navigation methods, the mobile robot navigation decision-making method based on visual feature map reconstruction can adapt to completely unknown scenes and to changes in the positions of objects in the environment, improving the generalization of mobile robot navigation to complex and changeable environments and thereby further improving navigation efficiency. Fig. 1 shows the flow of the mobile robot navigation decision-making method of the present invention, which comprises the following steps:
(1) At any time t of the mobile robot navigation, a visual panorama centered on the mobile robot is generated with the OpenPano algorithm from four images in different directions acquired by four monocular vision sensors mounted on the mobile robot, realizing relatively continuous observation of the local environment. The four monocular vision sensors are arranged at the same height, with a shooting angle of 90 degrees between any two adjacent sensors, so that information about the surrounding environment centered on the mobile robot is obtained as comprehensively as possible, assisting the final navigation decision and improving its accuracy. As shown in fig. 2, where a shows the four visual images and b the generated visual panorama, the four visual images in a are stitched into the panorama in b centered on the mobile robot.
(2) Based on the relevance measurement of the visual patches, extracting visual features related to a navigation target of the mobile robot in the visual panorama in the step (1), and reconstructing a mobile robot visual image feature map guided by the navigation target; the method specifically comprises the following substeps:
(2.1) Design a feature map extraction framework for the current visual panorama based on the VGG19 network. The structure of the VGG19 network is as follows: the channel counts of its convolutional layers are 64, 128, 256, 512 and 512, and each convolutional layer has a 3 × 3 kernel with stride 1, ensuring that the tensors of adjacent convolutional layers keep the same width and height.
(2.2) Extract the feature map of the navigation target image and the feature map of the robot's current visual image through the feature map extraction framework, and measure the correlation between patches on the robot's current visual feature map and patches on the navigation target image feature map. Specifically, the framework extracts the navigation target image feature map Ψ(I_g) from the navigation target image I_g, and the robot current visual image feature map Ψ(I_t) from the robot's visual panorama I_t at the current moment; patches Ψ_j(I_g), j = 1, …, n_g, are taken on Ψ(I_g), and patches Ψ_i(I_t), i = 1, …, n_t, are taken on Ψ(I_t); each feature patch Ψ_i(I_t) is matched, by maximizing the correlation between feature patches, to the most correlated feature patch Ψ*(I_g) on Ψ(I_g). The correlation metric formula is:

Ψ*(I_g) := argmax_{j = 1, …, n_g} ⟨Ψ_i(I_t), Ψ_j(I_g)⟩ / (‖Ψ_i(I_t)‖ · ‖Ψ_j(I_g)‖)

wherein := denotes a definition symbol; n_g is the number of patches on Ψ(I_g) and j indexes its patches; n_t is the number of patches on Ψ(I_t) and i indexes its patches; ⟨Ψ_i(I_t), Ψ_j(I_g)⟩ denotes the inner product of the i-th patch of Ψ(I_t) with the j-th patch of Ψ(I_g); ‖Ψ_i(I_t)‖ and ‖Ψ_j(I_g)‖ denote the norms of the patches. To enhance the features of the mobile robot's visual image closely related to the navigation target, this correlation metric measures the correlation between a visual image feature patch and any navigation target image feature patch and finds the most correlated target patch. Reconstructing the visual image feature map in this way enhances the visual features closely related to the navigation target, guides the final navigation decision, and improves the adaptability of mobile robot navigation to complex and changeable environments.
(2.3) Based on the correlation measurement, construct the navigation-target-guided mobile robot visual image feature map patch Ψ̂_i(I_t, I_g) and reconstruct the mobile robot visual image feature map; in the finally reconstructed feature map, the features closely related to the target image I_g are enhanced:

Ψ̂_i(I_t, I_g) := (⟨Ψ_i(I_t), Ψ*(I_g)⟩ / (‖Ψ_i(I_t)‖ · ‖Ψ*(I_g)‖)) · Ψ_i(I_t)

wherein Ψ_i(I_t) denotes the i-th patch of the robot current visual image feature map Ψ(I_t); Ψ*(I_g) denotes the feature patch of the navigation target image feature map Ψ(I_g) most correlated with Ψ_i(I_t); ⟨Ψ_i(I_t), Ψ*(I_g)⟩ denotes their inner product; ‖Ψ_i(I_t)‖ and ‖Ψ*(I_g)‖ denote their norms.
(3) Based on the reconstructed mobile robot visual image feature map, construct a mobile robot navigation decision model based on deep reinforcement learning and issue the motion commands required for the mobile robot to navigate to the target, realizing navigation-target-driven intelligent navigation of the mobile robot in a complex indoor environment; this comprises the following sub-steps:
(3.1) determining an output decision action space of the navigation decision model and a reward function in the process of training the decision model; the specific implementation process is as follows:
(a) establishing a robot body coordinate system with the center of a mobile robot chassis as an origin, wherein the direction vertical to the chassis is a Z-axis forward direction, the direction right in front of the mobile robot is an X-axis forward direction, and the Y-axis forward direction is determined based on a right-hand rule;
(b) Determine the output decision action space A of the navigation decision model as {forward translation, backward translation, leftward translation, rightward translation, leftward rotation, rightward rotation, stop}, and specify that each translation in A covers a distance of 0.5 m and each rotation an angle of 30 degrees. If the translation distance or rotation angle is too large, the safety of the mobile robot is hard to guarantee; if too small, the navigation time is prolonged.
(c) Determine the reward function in training the decision model as r_t = r_nav + r_g + r_c. Here r_nav = Geo(I_{t-1}, I_g) - Geo(I_t, I_g) rewards effective navigation: the closer the robot moves to the target image I_g, the more positive the reward, and the farther it moves from the target, the more negative the reward; Geo(I_t, I_g) denotes the geodesic distance from the mobile robot at the current moment to the position of the target image, and Geo(I_{t-1}, I_g) the geodesic distance at the previous moment; r_nav ensures effective movement of the mobile robot. r_g rewards completion of the navigation task: r_g = 10.0 when the mobile robot completes the task, otherwise r_g = 0; r_g mainly trains the mobile robot to reach the position of the target image in as little time as possible. r_c penalizes collisions during navigation: r_c = -2.0 when the mobile robot collides, otherwise r_c = 0; r_c reduces collisions with the environment as much as possible during navigation. The reward function so determined improves the accuracy of the mobile robot navigation decision model, and thus the success rate and efficiency of navigation.
(3.2) Take the reconstructed mobile robot visual image feature map as the front end of the navigation decision model and establish the navigation decision model based on the deep reinforcement learning framework A3C. As shown in the schematic diagram of the navigation decision model framework (FIG. 3), the reconstructed feature map serves as the front end of the model, which consists of 5 convolutional layers, 3 first fully-connected layers and 1 second fully-connected layer connected in sequence; the channel counts of the convolutional layers are 512, 256, 128 and 128, and each convolutional layer has a 3 × 3 kernel with stride 1; the sizes of the first fully-connected layers are 512, 128 and 64; the second fully-connected layer has two branches, one outputting a 7-dimensional decision command over the decision action space and the other outputting a 1-dimensional decision evaluation value.
(3.3) Train the navigation decision model based on the deep reinforcement learning framework A3C until the reward obtained by the mobile robot in each round of training no longer increases, then stop training to obtain the trained navigation decision model, on which the motion commands required for the mobile robot to navigate to the target are issued, realizing intelligent navigation of the mobile robot. The specific implementation is as follows: randomly initialize the parameters of the navigation decision model, and update them with stochastic gradient descent (SGD) on the A3C loss function, with a learning rate of 1e-4 and momentum of 0.9; randomly assign 6 navigation tasks to 6 threads executed over the same time period, and compute the loss function and gradients and update the navigation decision model parameters only after each batch of 10 navigation decisions has been executed on the 6 navigation tasks; iterate this process until the reward obtained by the mobile robot in each round no longer increases, completing the online training process and yielding the trained navigation decision model.
The mobile robot is placed in a complex spatial scene and a navigation target point is set; using the navigation decision-making method based on visual feature map reconstruction, the mobile robot navigates intelligently, avoids environmental obstacles and successfully reaches the target point. The method enables the mobile robot to navigate to the vicinity of the target image without building an environment map, improving adaptability to complex and variable environments and navigation efficiency.
The above are only preferred embodiments of the present invention, and the scope of the present invention is not limited to the above examples, and all technical solutions that fall under the spirit of the present invention belong to the scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may be made by those skilled in the art without departing from the principle of the invention.
Claims (8)
1. A mobile robot navigation decision-making method based on visual feature map reconstruction is characterized by comprising the following steps:
(1) at any time t of the mobile robot navigation, generating a visual panorama centering on the mobile robot based on an OpenPano algorithm for four images in different directions acquired by four monocular visual sensors carried on the mobile robot;
(2) based on the relevance measurement of the visual patches, extracting visual features related to a navigation target of the mobile robot in the visual panorama in the step (1), and reconstructing a mobile robot visual image feature map guided by the navigation target; the method comprises the following substeps:
(2.1) designing a feature map extraction framework of the current visual panorama based on the VGG19 network;
(2.2) extracting the navigation target image feature map and the robot's current visual image feature map through the feature map extraction framework, and measuring the correlation between each patch on the robot's current visual feature map and the patches on the navigation target image feature map;
(2.3) constructing a mobile robot visual image feature map patch guided by a navigation target based on the measurement result of the correlation, and reconstructing a mobile robot visual image feature map;
(3) constructing a mobile robot navigation decision model based on deep reinforcement learning from the reconstructed visual image feature map of the mobile robot, and issuing the motion commands required for the mobile robot to navigate to the target, comprising the following substeps:
(3.1) determining an output decision action space of the navigation decision model and a reward function in the process of training the decision model;
(3.2) taking the reconstructed visual image feature map of the mobile robot as the front end of the navigation decision model, and establishing the navigation decision model based on the deep reinforcement learning framework A3C;
(3.3) training the navigation decision model based on the deep reinforcement learning framework A3C until the reward obtained by the mobile robot in each episode of training no longer increases, then stopping training to obtain the trained navigation decision model, which issues the motion commands required for the mobile robot to navigate to the target, realizing intelligent navigation of the mobile robot.
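Step (1) of the claim stitches four 90°-spaced views into a 360° panorama. The sketch below is a crude numpy stand-in for illustration only: real stitching (e.g. with OpenPano) additionally performs feature matching, warping, and blending, whereas this simply concatenates the four views; the equal-height mounting assumption comes from claim 2.

```python
import numpy as np

def naive_panorama(front, right, back, left):
    """Join four same-height camera views (H, W, 3) into one 360-degree strip.
    Placeholder for a real stitcher such as OpenPano: no overlap handling,
    no warping, no blending -- just horizontal concatenation in mount order."""
    views = [front, right, back, left]
    h = views[0].shape[0]
    # Claim 2: all four monocular sensors are mounted at the same height.
    assert all(v.shape[0] == h for v in views), "views must share a common height"
    return np.concatenate(views, axis=1)
```

The resulting strip width is the sum of the four view widths; a real panorama would be narrower because adjacent views overlap.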
2. The mobile robot navigation decision method based on visual feature map reconstruction as claimed in claim 1, wherein four monocular visual sensors mounted on the mobile robot in step (1) are arranged at the same height, and a shooting angle between any two adjacent monocular visual sensors is 90 degrees.
3. The visual feature map reconstruction-based mobile robot navigation decision method according to claim 1, wherein the VGG19 network structure in step (2.1) is specifically: 16 convolutional layers organized in five blocks with 64, 128, 256, 512, and 512 channels respectively; the convolution kernel of each convolutional layer is 3 × 3, and the stride is 1.
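The convolutional configuration in claim 3 matches the standard VGG19 layout; a small sanity check, assuming the usual block structure with max-pooling (marked `'M'`) between blocks, which the claim does not spell out:

```python
# Standard VGG19 convolutional configuration: 16 conv layers in five blocks of
# 64/128/256/512/512 channels, all 3x3 kernels with stride 1; 'M' marks the
# max-pooling layer between blocks (assumed from the usual VGG19 layout).
VGG19_CFG = [64, 64, 'M',
             128, 128, 'M',
             256, 256, 256, 256, 'M',
             512, 512, 512, 512, 'M',
             512, 512, 512, 512, 'M']

def count_conv_layers(cfg):
    """Count convolutional layers, skipping pooling markers."""
    return sum(1 for v in cfg if v != 'M')
```

With this layout, `count_conv_layers(VGG19_CFG)` gives the 16 convolutional layers stated in the claim.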
4. The mobile robot navigation decision method based on visual feature map reconstruction as claimed in claim 1, wherein step (2.2) is implemented as follows: the feature map extraction framework extracts the navigation target image feature map Ψ(I_g) from the navigation target image I_g, and the robot current visual image feature map Ψ(I_t) from the visual panorama I_t of the robot at the current moment; patches Ψ_j(I_g) on the navigation target image feature map Ψ(I_g) and patches Ψ_i(I_t) on the robot current visual image feature map Ψ(I_t) are extracted respectively, and each feature patch Ψ_i(I_t) of the robot current visual image feature map is matched, by maximizing the correlation between feature patches, to the most relevant feature patch Ψ*(I_g) on the navigation target image feature map:

Ψ*(I_g) ≜ argmax_{j ∈ {1, …, n_g}} ⟨Ψ_i(I_t), Ψ_j(I_g)⟩ / (‖Ψ_i(I_t)‖ · ‖Ψ_j(I_g)‖)

wherein ≜ denotes a definition, n_g denotes the number of patches on the navigation target image feature map Ψ(I_g) and j indexes the patches on Ψ(I_g), n_t denotes the number of patches on the robot current visual image feature map Ψ(I_t) and i indexes the patches on Ψ(I_t), ⟨Ψ_i(I_t), Ψ_j(I_g)⟩ denotes the inner product of the i-th patch of Ψ(I_t) and the j-th patch of Ψ(I_g), and ‖Ψ_i(I_t)‖ and ‖Ψ_j(I_g)‖ denote the moduli of Ψ_i(I_t) and Ψ_j(I_g) respectively.
5. The visual feature map reconstruction-based mobile robot navigation decision method as claimed in claim 1, wherein in step (2.3) the navigation-target-guided visual image feature map patch Ψ_i(I_t, I_g) of the mobile robot is constructed as:

Ψ_i(I_t, I_g) ≜ ⟨Ψ_i(I_t), Ψ*(I_g)⟩ / (‖Ψ_i(I_t)‖ · ‖Ψ*(I_g)‖)

wherein Ψ_i(I_t) denotes the i-th patch on the robot current visual image feature map Ψ(I_t), Ψ*(I_g) denotes the feature patch on the navigation target image feature map Ψ(I_g) most relevant to Ψ_i(I_t), ⟨Ψ_i(I_t), Ψ*(I_g)⟩ denotes the inner product of Ψ_i(I_t) and Ψ*(I_g), and ‖Ψ_i(I_t)‖ and ‖Ψ*(I_g)‖ denote the moduli of Ψ_i(I_t) and Ψ*(I_g) respectively.
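The patch matching of claim 4 and the correlation score of claim 5 amount to a cosine-similarity argmax over patch pairs. A minimal numpy sketch, assuming patches have already been flattened to vectors (the claims operate on VGG19 feature map patches):

```python
import numpy as np

def reconstruct_feature_map(patches_t, patches_g):
    """For each current-view patch (rows of patches_t, shape (n_t, d)), find the
    most correlated target patch (rows of patches_g, shape (n_g, d)) by cosine
    similarity, returning the matched indices and the correlation scores that
    form the target-guided reconstructed feature map."""
    t = patches_t / np.linalg.norm(patches_t, axis=1, keepdims=True)
    g = patches_g / np.linalg.norm(patches_g, axis=1, keepdims=True)
    sim = t @ g.T                               # (n_t, n_g) cosine similarities
    best = sim.argmax(axis=1)                   # index of most relevant target patch
    scores = sim[np.arange(len(best)), best]    # claim 5's correlation value per patch
    return best, scores
```

Normalizing both sides first makes the inner product equal to the normalized correlation in the claims, so one matrix product covers all patch pairs at once.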
6. The visual feature map reconstruction-based mobile robot navigation decision method according to claim 1, wherein the step (3.1) is realized by the following steps:
(a) establishing a robot body coordinate system with the center of the mobile robot chassis as the origin, wherein the direction perpendicular to the chassis is the positive Z-axis, the direction directly ahead of the mobile robot is the positive X-axis, and the positive Y-axis is determined by the right-hand rule;
(b) determining the output decision action space of the navigation decision model as A = {translate forward, translate backward, translate left, translate right, rotate left, rotate right, stop}, and specifying that each translation motion in the decision action space A covers a distance of 0.5 m and each rotation an angle of 30 degrees;
(c) determining the reward function used in training the decision model as r_t = r_nav + r_g + r_c, wherein r_nav = Geo(I_{t-1}, I_g) − Geo(I_t, I_g) is the reward for effective navigation of the mobile robot, Geo(I_t, I_g) denoting the geodesic distance from the mobile robot at the current moment to the location of the target image and Geo(I_{t-1}, I_g) the geodesic distance from the robot at the previous moment to the location of the target image; r_g is the reward for completing the navigation task: r_g = 10.0 when the mobile robot completes the navigation task, otherwise r_g = 0; r_c is the penalty for collision during navigation: r_c = −2.0 when the mobile robot collides during navigation, otherwise r_c = 0.
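The three-term reward of step (3.1)(c) can be sketched as a small function. The geodesic distances are supplied as arguments here, whereas in the patent they are measured in the environment; the completion and collision flags are likewise stand-ins for the simulator's event signals.

```python
def step_reward(geo_prev, geo_now, reached_goal, collided):
    """r_t = r_nav + r_g + r_c as in step (3.1)(c):
    r_nav rewards geodesic progress toward the target image location,
    r_g = 10.0 on task completion (else 0),
    r_c = -2.0 on collision (else 0)."""
    r_nav = geo_prev - geo_now            # positive when the robot moved closer
    r_g = 10.0 if reached_goal else 0.0
    r_c = -2.0 if collided else 0.0
    return r_nav + r_g + r_c
```

For example, moving 0.5 m closer along the geodesic without finishing or colliding yields a reward of 0.5.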
7. The visual feature map reconstruction-based mobile robot navigation decision method as claimed in claim 1, wherein the navigation decision model based on the deep reinforcement learning framework A3C is composed of 5 convolutional layers, 3 first fully-connected layers, and 1 second fully-connected layer connected in sequence; the channel numbers of the convolutional layers are 512, 256, 128, and 128 respectively, each convolution kernel is 3 × 3, and the stride is 1; the sizes of the first fully-connected layers are 512, 128, and 64 respectively; the second fully-connected layer has two branches, one outputting the 7-dimensional decision command over the decision action space and the other outputting a 1-dimensional decision evaluation value.
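The two-branch output layer of claim 7 can be sketched with random stand-in weights: one branch produces the 7-dimensional policy over claim 6's action space, the other the scalar value estimate. The input dimension of 64 follows the last first-fully-connected size in the claim; the convolutional front end is omitted.

```python
import numpy as np

# 7-dimensional decision action space of claim 6 (names are illustrative labels).
ACTIONS = ["forward", "backward", "left", "right",
           "rotate_left", "rotate_right", "stop"]

def softmax(x):
    e = np.exp(x - x.max())   # shift for numerical stability
    return e / e.sum()

class TwoBranchHead:
    """Two-branch second fully-connected layer: a 7-dim policy over the decision
    action space and a 1-dim value (decision evaluation) estimate.
    Weights are random stand-ins, not trained parameters."""
    def __init__(self, in_dim=64, n_actions=len(ACTIONS), seed=0):
        rng = np.random.default_rng(seed)
        self.Wp = 0.01 * rng.standard_normal((in_dim, n_actions))  # policy branch
        self.Wv = 0.01 * rng.standard_normal((in_dim, 1))          # value branch

    def forward(self, h):
        policy = softmax(h @ self.Wp)          # probabilities over 7 actions
        value = float((h @ self.Wv)[0])        # scalar decision evaluation
        return policy, value
```

Sharing the feature vector `h` between the policy and value branches is the standard A3C design; only the final projections differ.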
8. The visual feature map reconstruction-based mobile robot navigation decision method according to claim 1, wherein step (3.3) is realized by the following process: randomly initializing the parameters of the navigation decision model based on the deep reinforcement learning framework A3C, and updating them by stochastic gradient descent (SGD) on the A3C loss function, with the learning rate set to 1e-4 and the momentum set to 0.9; randomly assigning 6 navigation tasks to 6 threads executing over the same time period, and computing the loss function and gradients and updating the navigation decision model parameters only after each of the 6 navigation tasks has executed 10 navigation decisions; and iterating this process until the reward obtained by the mobile robot in each episode no longer increases, completing the online training process to obtain the trained navigation decision model, which issues the motion commands required for the mobile robot to navigate to the target, realizing intelligent navigation of the mobile robot.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210207094.4A CN114594768B (en) | 2022-03-03 | 2022-03-03 | Mobile robot navigation decision-making method based on visual feature map reconstruction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210207094.4A CN114594768B (en) | 2022-03-03 | 2022-03-03 | Mobile robot navigation decision-making method based on visual feature map reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114594768A CN114594768A (en) | 2022-06-07 |
CN114594768B true CN114594768B (en) | 2022-08-23 |
Family
ID=81807371
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210207094.4A Active CN114594768B (en) | 2022-03-03 | 2022-03-03 | Mobile robot navigation decision-making method based on visual feature map reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114594768B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116989800B (en) * | 2023-09-27 | 2023-12-15 | 安徽大学 | Mobile robot visual navigation decision-making method based on pulse reinforcement learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109682392A (en) * | 2018-12-28 | 2019-04-26 | 山东大学 | Visual navigation method and system based on deep reinforcement learning |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10147193B2 (en) * | 2017-03-10 | 2018-12-04 | TuSimple | System and method for semantic segmentation using hybrid dilated convolution (HDC) |
US10695911B2 (en) * | 2018-01-12 | 2020-06-30 | Futurewei Technologies, Inc. | Robot navigation and object tracking |
US10902616B2 (en) * | 2018-08-13 | 2021-01-26 | Nvidia Corporation | Scene embedding for visual navigation |
CN109902675B (en) * | 2018-09-17 | 2021-05-04 | 华为技术有限公司 | Object pose acquisition method and scene reconstruction method and device |
CN109459025B (en) * | 2018-11-08 | 2020-09-04 | 中北大学 | Similar brain navigation method based on optical flow UWB combination |
US11037051B2 (en) * | 2018-11-28 | 2021-06-15 | Nvidia Corporation | 3D plane detection and reconstruction using a monocular image |
CN110045740A (en) * | 2019-05-15 | 2019-07-23 | 长春师范大学 | A kind of Mobile Robot Real-time Motion planing method based on human behavior simulation |
US11263756B2 (en) * | 2019-12-09 | 2022-03-01 | Naver Corporation | Method and apparatus for semantic segmentation and depth completion using a convolutional neural network |
US11847730B2 (en) * | 2020-01-24 | 2023-12-19 | Covidien Lp | Orientation detection in fluoroscopic images |
US20210252698A1 (en) * | 2020-02-14 | 2021-08-19 | Nvidia Corporation | Robotic control using deep learning |
CN112180937A (en) * | 2020-10-14 | 2021-01-05 | 中国安全生产科学研究院 | Subway carriage disinfection robot and automatic navigation method thereof |
CN113093727B (en) * | 2021-03-08 | 2023-03-28 | 哈尔滨工业大学(深圳) | Robot map-free navigation method based on deep security reinforcement learning |
CN113096190B (en) * | 2021-03-27 | 2024-01-05 | 大连理工大学 | Omnidirectional mobile robot navigation method based on visual mapping |
CN113392584B (en) * | 2021-06-08 | 2022-12-16 | 华南理工大学 | Visual navigation method based on deep reinforcement learning and direction estimation |
CN215767102U (en) * | 2021-12-20 | 2022-02-08 | 中北大学 | Airborne inertial/polarized light/optical flow/visual combined navigation device |
- 2022-03-03 CN CN202210207094.4A patent/CN114594768B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109682392A (en) * | 2018-12-28 | 2019-04-26 | 山东大学 | Visual navigation method and system based on deep reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN114594768A (en) | 2022-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Singla et al. | Memory-based deep reinforcement learning for obstacle avoidance in UAV with limited environment knowledge | |
EP3405845B1 (en) | Object-focused active three-dimensional reconstruction | |
Chen et al. | Brain-inspired cognitive model with attention for self-driving cars | |
CN108491880B (en) | Object classification and pose estimation method based on neural network | |
Fan et al. | Learning collision-free space detection from stereo images: Homography matrix brings better data augmentation | |
CN110874578B (en) | Unmanned aerial vehicle visual angle vehicle recognition tracking method based on reinforcement learning | |
JP6915909B2 (en) | Device movement control methods, control devices, storage media and electronic devices | |
CN107397658B (en) | Multi-scale full-convolution network and visual blind guiding method and device | |
CN105760894A (en) | Robot navigation method based on machine vision and machine learning | |
CN112232490A (en) | Deep simulation reinforcement learning driving strategy training method based on vision | |
CN110942512B (en) | Indoor scene reconstruction method based on meta-learning | |
CN106127125A (en) | Distributed DTW human body behavior intension recognizing method based on human body behavior characteristics | |
CN110260866A (en) | A kind of robot localization and barrier-avoiding method of view-based access control model sensor | |
CN112365604A (en) | AR equipment depth of field information application method based on semantic segmentation and SLAM | |
CN107363834B (en) | Mechanical arm grabbing method based on cognitive map | |
CN114594768B (en) | Mobile robot navigation decision-making method based on visual feature map reconstruction | |
CN113538218B (en) | Weak pairing image style migration method based on pose self-supervision countermeasure generation network | |
CN104850120A (en) | Wheel type mobile robot navigation method based on IHDR self-learning frame | |
WO2022228391A1 (en) | Terminal device positioning method and related device therefor | |
CN113752255B (en) | Mechanical arm six-degree-of-freedom real-time grabbing method based on deep reinforcement learning | |
CN106127119B (en) | Joint probabilistic data association method based on color image and depth image multiple features | |
Shi et al. | Underwater formation system design and implement for small spherical robots | |
CN117214904A (en) | Intelligent fish identification monitoring method and system based on multi-sensor data | |
CN112560571A (en) | Intelligent autonomous visual navigation method based on convolutional neural network | |
CN114594776B (en) | Navigation obstacle avoidance method based on layering and modular learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||