CN112327821A - Intelligent cleaning robot path planning method based on deep reinforcement learning - Google Patents


Info

Publication number
CN112327821A
CN112327821A (application CN202010651117.1A)
Authority
CN
China
Prior art keywords
strategy
cleaning robot
path planning
reinforcement learning
network
Prior art date
Legal status
Pending
Application number
CN202010651117.1A
Other languages
Chinese (zh)
Inventor
杜林
Current Assignee
Dongguan Junyi Vision Technology Co Ltd
Original Assignee
Dongguan Junyi Vision Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Dongguan Junyi Vision Technology Co Ltd filed Critical Dongguan Junyi Vision Technology Co Ltd
Priority to CN202010651117.1A priority Critical patent/CN112327821A/en
Publication of CN112327821A publication Critical patent/CN112327821A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS › G05 CONTROLLING; REGULATING › G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES › G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot › G05D1/02 Control of position or course in two dimensions › G05D1/021 specially adapted to land vehicles
    • G05D1/0221: with means for defining a desired trajectory involving a learning process
    • G05D1/0238: using optical position detecting means with obstacle or wall sensors
    • G05D1/024: obstacle or wall sensors in combination with a laser
    • G05D1/0242: using optical position detecting means with non-visible light signals, e.g. IR or UV signals
    • G05D1/0246: using a video camera in combination with image processing means
    • G05D1/0255: using acoustic signals, e.g. ultrasonic signals
    • G05D1/0257: using a radar

Abstract

The invention discloses an intelligent cleaning robot path planning method based on deep reinforcement learning, which enables a cleaning robot to preferentially clean places with large amounts of garbage, adaptively avoid obstacles, return to charge in time, and the like. The method uses the deep reinforcement learning DDPG algorithm: a strategy neural network generates behavior strategies, comprising a cleaning behavior strategy and a motion behavior strategy. Exploration noise is added to the behavior strategy, which is then sent to the intelligent cleaning robot for execution; state information is obtained by fusing the readings of the sensor system, and the current return value is calculated by a designed return function. The algorithm stores the state, action, next state, and return value of each step in an experience cache pool, randomly samples experiences from it, and trains the neural networks by gradient descent. The method is reasonable and highly practical, and is mainly intended for indoor navigation.

Description

Intelligent cleaning robot path planning method based on deep reinforcement learning
Technical Field
The invention relates to the field of intelligent cleaning robots, in particular to an intelligent cleaning robot path planning method based on deep reinforcement learning.
Background
With the growth of the property management industry, the front-line workforce of most property service enterprises consists largely of employees over 50 years old, and young workers are scarce. An intelligent cleaning robot can effectively relieve this front-line labor shortage, help enterprises rapidly deliver services outward, and increase the added value of other services.
However, indoor navigation of intelligent cleaning robots currently relies mainly on Simultaneous Localization and Mapping (SLAM) technology; owing to path planning problems, this leads to incomplete cleaning of some areas, low cleaning efficiency, and the like.
The Deep Deterministic Policy Gradient (DDPG) algorithm, a classical deep reinforcement learning algorithm, has significant advantages in continuous control.
The invention provides an intelligent cleaning robot path planning method based on deep reinforcement learning. The method is based on the DDPG algorithm, fuses information from multiple sensors, and dynamically plans the cleaning robot's path, so that the robot preferentially cleans places with large amounts of garbage, adaptively avoids obstacles, and returns to charge in time.
Disclosure of Invention
In order to overcome the defects and shortcomings in the prior art, the invention aims to provide an intelligent cleaning robot path planning method based on deep reinforcement learning so as to improve the working efficiency of a cleaning robot.
The invention is realized by the following technical scheme:
an intelligent cleaning robot path planning method based on deep reinforcement learning is characterized by comprising the following steps:
S1, initializing a strategy neural network, a judgment network, a target strategy network, a target judgment network, the network parameters, an experience cache pool, and the cleaning robot;
S2, the cleaning robot senses the surrounding environment through its sensors, fuses the sensor data, and determines, from the ground conditions, the garbage distribution, and the surrounding obstacles, whether obstacles exist around the cleaning robot and what the robot's own state is;
S3, the strategy neural network receives the sensor data of the surrounding environment and, after the data are input, selects a behavior strategy to execute through calculation;
S4, the cleaning robot executes the behavior strategy by converting it into instructions that the driving mechanism can recognize and inputting them into the driving mechanism;
S5, after the upper computer sends the instructions, the lower computer receives them and executes the corresponding actions to complete the cleaning task and path planning; on completion, the reward r_t and the next state s_{t+1} are obtained;
S6, judging whether the cleaning robot has reached the garbage station and whether the action time is over; if so, steps S1 to S6 continue to be executed; otherwise, the experience of steps S1 to S6 is collected and step S7 is executed;
S7, storing the experience in an experience cache pool, which makes the states mutually independent and eliminates the strong correlation between input experiences;
S8, randomly sampling N experiences from the experience cache pool and calculating the loss function value of the strategy value algorithm and the loss function value of the strategy decision algorithm;
S9, calculating the expected return of the current strategy through the target strategy network and the evaluation network, and estimating the accumulated return of each state-strategy pair;
S10, training the neural networks by gradient descent: a stochastic gradient descent algorithm updates the weight coefficients of the target value network to minimize the loss function, and the computed gradients update the parameters of the target strategy network and the strategy neural network.
Wherein, the sensors in step S2 may be one or more of a gyroscope, a lidar, a camera, an ultrasonic sensor, and an infrared sensor.
The behavior strategy in step S3 includes a cleaning behavior strategy and a motion behavior strategy, where the cleaning behavior strategy includes washing, mopping, sweeping, and sucking, and the motion behavior strategy includes forward, backward, left-turning, right-turning, and braking.
Wherein, the driving mechanism in step S4 includes one of a motion motor, a rolling brush motor, an edge brush motor, a disc brush motor, a mop driving motor, and a dust collection motor.
In step S5, the reward r_t is positively correlated with factors such as the amount of garbage collected, the cleaning range, successful obstacle avoidance, and the remaining battery level.
In step S8, the loss function evaluation index is a mean square error.
In step S10, an Adam optimizer is used for the random gradient descent.
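The training loop of steps S1 to S10 can be sketched end to end as follows. This is a minimal numpy sketch under loudly stated assumptions: the toy one-dimensional environment, the linear stand-ins for the strategy and judgment networks, and every hyperparameter are illustrative inventions for this sketch, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, BUFFER_SIZE, BATCH, GAMMA, LR, TAU = 4, 500, 16, 0.9, 1e-4, 0.01

policy_w = rng.normal(size=STATE_DIM)                    # S1: strategy (policy) network
critic_w = rng.normal(size=STATE_DIM + 1)                # S1: judgment (critic) network
target_pw, target_cw = policy_w.copy(), critic_w.copy()  # S1: target networks
buffer = []                                              # S1: experience cache pool

def env_step(state, action):
    # Toy stand-in for S4-S5: the robot executes the action and the
    # environment returns a reward and the next fused sensor state.
    next_state = np.tanh(state + 0.1 * action + 0.05 * rng.normal(size=STATE_DIM))
    reward = -abs(float(action))                         # placeholder return function
    return reward, next_state

state = np.tanh(rng.normal(size=STATE_DIM))              # S2: sensed and fused state
for t in range(100):
    action = float(policy_w @ state) + 0.1 * rng.normal()   # S3: strategy + exploration noise
    reward, next_state = env_step(state, action)            # S4-S5: obtain r_t, s_{t+1}
    buffer.append((state, action, reward, next_state))      # S6-S7: store experience
    buffer = buffer[-BUFFER_SIZE:]
    if len(buffer) >= BATCH:                                # S8: sample N experiences
        idx = rng.choice(len(buffer), BATCH, replace=False)
        for s, a, r, s2 in (buffer[i] for i in idx):
            a2 = float(target_pw @ s2)                      # S9: target strategy action
            y = r + GAMMA * float(target_cw @ np.append(s2, a2))  # TD target
            q = float(critic_w @ np.append(s, a))
            critic_w += LR * (y - q) * np.append(s, a)      # S10: gradient step on MSE
        target_cw += TAU * (critic_w - target_cw)           # S10: soft target update
    state = next_state
```

A real implementation would replace the linear stand-ins with the strategy and judgment neural networks and drive the actual robot actuators in `env_step`.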
The invention has the beneficial effects that:
Improved cleaning robot working efficiency: the intelligent cleaning robot path planning method based on deep reinforcement learning provides modules such as a strategy neural network, a judgment network, a target strategy network, a target judgment network, network parameters, and an experience cache pool, and the method calls and applies each module so that the cleaning robot works under positive feedback, improving its working efficiency.
Drawings
The invention is further illustrated by the accompanying drawings, but the embodiments in the drawings do not limit the invention in any way; a person skilled in the art can obtain other drawings from the following drawings without inventive effort.
FIG. 1 is a block flow diagram of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific examples, and those skilled in the art can easily understand other advantages and effects of the invention from the disclosure of this specification. The invention may also be implemented or applied through other, different embodiments, and the details of this specification may be modified or changed in various respects without departing from the spirit and scope of the invention.
It should be noted that the structures shown in the drawings are only used to illustrate the disclosure so that it can be understood and read by those skilled in the art; they are not intended to limit the conditions under which the invention can be practiced and carry no particular technical significance on their own. Any modification or adjustment of the structures that does not affect the function and achievable purpose of the invention still falls within the scope of its technical content.
Fig. 1 shows a schematic flow chart of an intelligent cleaning robot path planning method based on deep reinforcement learning according to an embodiment of the present invention; the method includes:
step S1: initializing a strategy neural network, a judgment network, a target strategy neural network, a target judgment network and network parameters, initializing an experience cache pool, and initializing a cleaning robot;
step S2: the cleaning robot senses the surrounding environment through its sensors, fuses the sensor data, builds a map, and identifies the ground environment and the garbage situation using vision techniques; the sensors include a gyroscope, a lidar, a camera, ultrasonic sensors, infrared sensors, and the like, with the specific set of sensor devices configured according to the actual requirements of the cleaning robot;
step S3: the strategy neural network receives the surrounding environment state data; after the sensor data are input, the network selects an execution strategy through calculation. The behavior strategy is a stochastic process generated from the current strategy and random noise, and its value is obtained by sampling this process; the behavior strategy plans to preferentially clean the places with the most garbage. In this embodiment, the cleaning behavior strategy comprises washing, mopping, sweeping, and sucking behaviors, and the motion behavior strategy comprises forward, backward, left-turn, right-turn, and braking behaviors;
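The stochastic process of step S3 is often realized in DDPG implementations as Ornstein-Uhlenbeck noise added to the strategy output. The patent says only "random noise", so the OU choice, the class name, and the parameters below are assumptions:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck exploration noise: temporally correlated, which
    suits smooth robot motion commands better than independent Gaussian noise."""
    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, seed=0):
        self.mu, self.theta, self.sigma = mu, theta, sigma
        self.rng = np.random.default_rng(seed)
        self.x = np.full(dim, mu)

    def sample(self):
        # Discrete OU step: dx = theta * (mu - x) + sigma * N(0, 1)
        self.x += self.theta * (self.mu - self.x) \
                  + self.sigma * self.rng.normal(size=self.x.shape)
        return self.x.copy()
```

A behavior action would then be, for example, the strategy network's output plus `noise.sample()`, clipped to the actuator range.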
step S4: the cleaning robot executes the behavior strategy by converting it into instructions that the motors can recognize and inputting them into the motors, thereby controlling each motor's rotating speed, direction, running time, and the like;
step S5: after the upper computer sends an instruction, the lower computer receives it and executes the corresponding actions to complete the cleaning task and path planning; the visual sensor is used to judge whether garbage remains in the current indoor environment, and the reward r_t and the next state s_{t+1} are obtained after execution is complete;
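The return value of step S5 might be sketched as below. The patent states only that the reward is positively correlated with the amount of garbage collected, the cleaning range, obstacle avoidance, and the battery level; every weight, threshold, and the function name here is an illustrative assumption:

```python
def compute_return(garbage_collected, new_area_covered, collided, battery_level):
    """Illustrative return function for step S5 (all weights are assumptions)."""
    r = 1.0 * garbage_collected        # prioritise heavily littered spots
    r += 0.2 * new_area_covered        # reward cleaning coverage
    if collided:
        r -= 5.0                       # penalise failed obstacle avoidance
    if battery_level < 0.2:
        r -= 1.0                       # push a timely return to charge
    return r
```

Shaping the reward this way is what makes the learned strategy clean the dirtiest places first while still avoiding obstacles and returning to charge.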
step S6: it is determined whether the cleaning robot has reached the garbage station and whether the operation time is over; if the operation is over, the process returns to step S1, otherwise it proceeds to step S7;
step S7: the executed action, the reward, and the associated states are stored as experience in the experience cache pool, which makes the states mutually independent and eliminates the strong correlation between input experiences;
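The experience cache pool of step S7 can be sketched as a fixed-capacity buffer with uniform random sampling (the class and method names are assumptions, not the patent's own):

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience cache pool: a fixed-capacity store of
    (state, action, reward, next_state) tuples."""
    def __init__(self, capacity=10000, seed=0):
        self.buf = deque(maxlen=capacity)   # oldest experiences discarded first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state):
        self.buf.append((state, action, reward, next_state))

    def sample(self, n):
        # Uniform sampling without replacement breaks the strong temporal
        # correlation between consecutive experiences.
        return self.rng.sample(list(self.buf), n)

    def __len__(self):
        return len(self.buf)
```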
step S8: N experiences are randomly sampled from the experience cache pool, and the loss function value of the strategy value algorithm and the loss function value of the strategy decision algorithm are calculated; preferably, the mean square error is used as the loss function evaluation index.
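The mean-squared-error loss of step S8, with the TD targets of step S9 formed from the target networks, can be sketched as follows. The networks are passed in as plain callables; this interface and the function name are assumptions:

```python
import numpy as np

def critic_mse_loss(batch, critic, target_policy, target_critic, gamma=0.99):
    """MSE loss over a sampled batch of (s, a, r, s2) tuples:
    y = r + gamma * Q'(s2, mu'(s2)) versus the current estimate Q(s, a)."""
    targets, estimates = [], []
    for s, a, r, s2 in batch:
        a2 = target_policy(s2)                             # target strategy action
        targets.append(r + gamma * target_critic(s2, a2))  # TD target y
        estimates.append(critic(s, a))                     # current estimate Q(s, a)
    targets, estimates = np.array(targets), np.array(estimates)
    return float(np.mean((targets - estimates) ** 2))
```

Minimizing this quantity by stochastic gradient descent is what step S10 refers to.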
Step S9: the expected return of the current strategy is calculated through the target judgment neural network, and the accumulated return of each state-strategy pair is estimated.
Step S10: the neural networks are trained by gradient descent. A stochastic gradient descent algorithm updates the weight coefficients of the target value network to minimize the loss function, and the computed gradients update the parameters of the target value neural network and the strategy neural network; the Adam optimizer is preferably adopted for the stochastic gradient descent.
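The target-network weight update of step S10 is commonly implemented as a Polyak ("soft") update. The patent does not name the scheme or give a rate, so both are assumptions here:

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.005):
    """Soft (Polyak) update: each target parameter moves a fraction tau
    toward its online counterpart, keeping the TD targets slowly changing."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]
```

Applied once per training step, the target networks lag the online networks, which stabilizes the targets used in the loss of step S8.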
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit its scope of protection. Although the invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that the technical solutions may be modified or equivalently substituted without departing from their spirit and scope.

Claims (7)

1. An intelligent cleaning robot path planning method based on deep reinforcement learning, characterized by comprising the following steps:
S1, initializing a strategy neural network, a judgment network, a target strategy network, a target judgment network, the network parameters, an experience cache pool, and the cleaning robot;
S2, the cleaning robot senses the surrounding environment through its sensors, fuses the sensor data, and determines, from the ground conditions, the garbage distribution, and the surrounding obstacles, whether obstacles exist around the cleaning robot and what the robot's own state is;
S3, the strategy neural network receives the sensor data of the surrounding environment and, after the data are input, selects a behavior strategy to execute through calculation;
S4, the cleaning robot executes the behavior strategy by converting it into instructions that the driving mechanism can recognize and inputting them into the driving mechanism;
S5, after the upper computer sends the instructions, the lower computer receives them and executes the corresponding actions to complete the cleaning task and path planning; on completion, the reward r_t and the next state s_{t+1} are obtained;
S6, judging whether the cleaning robot has reached the garbage station and whether the action time is over; if so, steps S1 to S6 continue to be executed; otherwise, the experience of steps S1 to S6 is collected and step S7 is executed;
S7, storing the experience in an experience cache pool, which makes the states mutually independent and eliminates the strong correlation between input experiences;
S8, randomly sampling N experiences from the experience cache pool and calculating the loss function value of the strategy value algorithm and the loss function value of the strategy decision algorithm;
S9, calculating the expected return of the current strategy through the target strategy network and the evaluation network, and estimating the accumulated return of each state-strategy pair;
S10, training the neural networks by gradient descent: a stochastic gradient descent algorithm updates the weight coefficients of the target value network to minimize the loss function, and the computed gradients update the parameters of the target strategy network and the strategy neural network.
2. The intelligent cleaning robot path planning method based on deep reinforcement learning of claim 1, wherein the sensors in step S2 may be one or more of a gyroscope, a lidar, a camera, an ultrasonic sensor, and an infrared sensor.
3. The intelligent cleaning robot path planning method based on deep reinforcement learning of claim 1, wherein the behavior strategies in step S3 include cleaning behavior strategies including washing, mopping, sweeping and sucking behaviors and motion behavior strategies including forward, backward, left-turning, right-turning and braking behaviors.
4. The intelligent cleaning robot path planning method based on deep reinforcement learning of claim 1, wherein the driving mechanism in step S4 comprises one of a motion motor, a rolling brush motor, an edge brush motor, a disc brush motor, a mop driving motor, and a dust collection motor.
5. The intelligent cleaning robot path planning method based on deep reinforcement learning of claim 1, wherein the reward r_t in step S5 is positively correlated with factors such as the amount of garbage collected, the cleaning range, successful obstacle avoidance, and the remaining battery level.
6. The intelligent cleaning robot path planning method based on deep reinforcement learning of claim 1, wherein the loss function evaluation index in step S8 is mean square error.
7. The intelligent cleaning robot path planning method based on deep reinforcement learning of claim 1, wherein the Adam optimizer is adopted for the stochastic gradient descent in step S10.
CN202010651117.1A 2020-07-08 2020-07-08 Intelligent cleaning robot path planning method based on deep reinforcement learning Pending CN112327821A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010651117.1A CN112327821A (en) 2020-07-08 2020-07-08 Intelligent cleaning robot path planning method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010651117.1A CN112327821A (en) 2020-07-08 2020-07-08 Intelligent cleaning robot path planning method based on deep reinforcement learning

Publications (1)

Publication Number Publication Date
CN112327821A 2021-02-05

Family

ID=74303637

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010651117.1A Pending CN112327821A (en) 2020-07-08 2020-07-08 Intelligent cleaning robot path planning method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN112327821A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112859885A (en) * 2021-04-25 2021-05-28 四川远方云天食品科技有限公司 Cooperative optimization method for path of feeding robot
CN113534821A (en) * 2021-09-14 2021-10-22 深圳市元鼎智能创新有限公司 Multi-sensor fusion sweeping robot movement obstacle avoidance method and device and robot
CN114587190A (en) * 2021-08-23 2022-06-07 北京石头世纪科技股份有限公司 Control method, system and device of cleaning device and computer readable storage medium
CN115545350A (en) * 2022-11-28 2022-12-30 湖南工商大学 Comprehensive deep neural network and reinforcement learning vehicle path problem solving method
CN116611635A (en) * 2023-04-23 2023-08-18 暨南大学 Sanitation robot car scheduling method and system based on car-road cooperation and reinforcement learning
CN117666593A (en) * 2024-02-01 2024-03-08 厦门蓝旭科技有限公司 Walking control optimization method for photovoltaic cleaning robot

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970615A (en) * 2017-03-21 2017-07-21 西北工业大学 A kind of real-time online paths planning method of deeply study
CN107168303A (en) * 2017-03-16 2017-09-15 中国科学院深圳先进技术研究院 A kind of automatic Pilot method and device of automobile
CN108415254A (en) * 2018-03-12 2018-08-17 苏州大学 Waste recovery robot control method based on depth Q networks and its device
CN108523768A (en) * 2018-03-12 2018-09-14 苏州大学 Household cleaning machine people's control system based on adaptive strategy optimization
US20180341378A1 (en) * 2015-11-25 2018-11-29 Supered Pty Ltd. Computer-implemented frameworks and methodologies configured to enable delivery of content and/or user interface functionality based on monitoring of activity in a user interface environment and/or control access to services delivered in an online environment responsive to operation of a risk assessment protocol
CN109726866A (en) * 2018-12-27 2019-05-07 浙江农林大学 Unmanned boat paths planning method based on Q learning neural network
CN109783412A (en) * 2019-01-18 2019-05-21 电子科技大学 A kind of method that deeply study accelerates training
CN109906132A (en) * 2016-09-15 2019-06-18 谷歌有限责任公司 The deeply of Robotic Manipulator learns
CN109976340A (en) * 2019-03-19 2019-07-05 中国人民解放军国防科技大学 Man-machine cooperation dynamic obstacle avoidance method and system based on deep reinforcement learning
CN110370295A (en) * 2019-07-02 2019-10-25 浙江大学 Soccer robot active control suction ball method based on deeply study
CN110597058A (en) * 2019-08-28 2019-12-20 浙江工业大学 Three-degree-of-freedom autonomous underwater vehicle control method based on reinforcement learning
CN111061277A (en) * 2019-12-31 2020-04-24 歌尔股份有限公司 Unmanned vehicle global path planning method and device
CN111260027A (en) * 2020-01-10 2020-06-09 电子科技大学 Intelligent agent automatic decision-making method based on reinforcement learning

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180341378A1 (en) * 2015-11-25 2018-11-29 Supered Pty Ltd. Computer-implemented frameworks and methodologies configured to enable delivery of content and/or user interface functionality based on monitoring of activity in a user interface environment and/or control access to services delivered in an online environment responsive to operation of a risk assessment protocol
CN109906132A (en) * 2016-09-15 2019-06-18 谷歌有限责任公司 The deeply of Robotic Manipulator learns
CN107168303A (en) * 2017-03-16 2017-09-15 中国科学院深圳先进技术研究院 A kind of automatic Pilot method and device of automobile
CN106970615A (en) * 2017-03-21 2017-07-21 西北工业大学 A kind of real-time online paths planning method of deeply study
CN108415254A (en) * 2018-03-12 2018-08-17 苏州大学 Waste recovery robot control method based on depth Q networks and its device
CN108523768A (en) * 2018-03-12 2018-09-14 苏州大学 Household cleaning machine people's control system based on adaptive strategy optimization
CN109726866A (en) * 2018-12-27 2019-05-07 浙江农林大学 Unmanned boat paths planning method based on Q learning neural network
CN109783412A (en) * 2019-01-18 2019-05-21 电子科技大学 A kind of method that deeply study accelerates training
CN109976340A (en) * 2019-03-19 2019-07-05 中国人民解放军国防科技大学 Man-machine cooperation dynamic obstacle avoidance method and system based on deep reinforcement learning
CN110370295A (en) * 2019-07-02 2019-10-25 浙江大学 Soccer robot active control suction ball method based on deeply study
CN110597058A (en) * 2019-08-28 2019-12-20 浙江工业大学 Three-degree-of-freedom autonomous underwater vehicle control method based on reinforcement learning
CN111061277A (en) * 2019-12-31 2020-04-24 歌尔股份有限公司 Unmanned vehicle global path planning method and device
CN111260027A (en) * 2020-01-10 2020-06-09 电子科技大学 Intelligent agent automatic decision-making method based on reinforcement learning

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112859885A (en) * 2021-04-25 2021-05-28 四川远方云天食品科技有限公司 Cooperative optimization method for path of feeding robot
CN114587190A (en) * 2021-08-23 2022-06-07 北京石头世纪科技股份有限公司 Control method, system and device of cleaning device and computer readable storage medium
CN113534821A (en) * 2021-09-14 2021-10-22 深圳市元鼎智能创新有限公司 Multi-sensor fusion sweeping robot movement obstacle avoidance method and device and robot
CN115545350A (en) * 2022-11-28 2022-12-30 湖南工商大学 Comprehensive deep neural network and reinforcement learning vehicle path problem solving method
CN115545350B (en) * 2022-11-28 2024-01-16 湖南工商大学 Vehicle path problem solving method integrating deep neural network and reinforcement learning
CN116611635A (en) * 2023-04-23 2023-08-18 暨南大学 Sanitation robot car scheduling method and system based on car-road cooperation and reinforcement learning
CN116611635B (en) * 2023-04-23 2024-01-30 暨南大学 Sanitation robot car scheduling method and system based on car-road cooperation and reinforcement learning
CN117666593A (en) * 2024-02-01 2024-03-08 厦门蓝旭科技有限公司 Walking control optimization method for photovoltaic cleaning robot
CN117666593B (en) * 2024-02-01 2024-04-09 厦门蓝旭科技有限公司 Walking control optimization method for photovoltaic cleaning robot

Similar Documents

Publication Publication Date Title
CN112327821A (en) Intelligent cleaning robot path planning method based on deep reinforcement learning
WO2021208225A1 (en) Obstacle avoidance method, apparatus, and device for epidemic-prevention disinfecting and cleaning robot
WO2021208380A1 (en) Disinfection and cleaning operation effect testing method and device for epidemic prevention disinfection and cleaning robot
Yang et al. A neural network approach to complete coverage path planning
CN104399682B (en) A kind of photovoltaic power station component cleans intelligent decision early warning system
CN108733061B (en) Path correction method for cleaning operation
Wang et al. Modeling motion patterns of dynamic objects by IOHMM
CN105139072A (en) Reinforcement learning algorithm applied to non-tracking intelligent trolley barrier-avoiding system
US11776409B2 (en) Methods, internet of things systems and storage mediums for street management in smart cities
Jung et al. Experiments in realising cooperation between autonomous mobile robots
AU2022204569B2 (en) Method for multi-agent dynamic path planning
EP2390740A2 (en) Autonomous machine selective consultation
CN108716201B (en) Collaborative sweeping method
CN111562784B (en) Disinfection method and equipment for mobile disinfection robot
CN109674404B (en) Obstacle avoidance processing mode of sweeping robot based on free move technology
CN111608124A (en) Autonomous navigation method for cleaning robot in high-speed service area
CN113566808A (en) Navigation path planning method, device, equipment and readable storage medium
CN112572466A (en) Control method of self-adaptive unmanned sweeper
CN114077807A (en) Computer implementation method and equipment for controlling mobile robot based on semantic environment diagram
CN114527762A (en) Automatic planning method for cleaning of photovoltaic cell panel
CN108762275B (en) Collaborative sweeping method
CN112168074B (en) Cleaning method and system of intelligent cleaning robot
CN117109574A (en) Agricultural transportation machinery coverage path planning method
CN108392153A (en) A kind of sweeping robot intelligence control system
CN107627314A (en) A kind of pathfinding robot, Pathfinding system and method for searching based on genetic algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination