CN114659524A - Simulation-based path planning method, system, electronic device and storage medium

Publication number
CN114659524A
Authority
CN
China
Prior art keywords
virtual
path planning
scene
simulation
network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210250343.8A
Other languages
Chinese (zh)
Inventor
卢志巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan United Imaging Zhirong Medical Technology Co Ltd
Original Assignee
Wuhan United Imaging Zhirong Medical Technology Co Ltd
Application filed by Wuhan United Imaging Zhirong Medical Technology Co Ltd
Priority to CN202210250343.8A
Publication of CN114659524A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/005 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 with correlation of navigation data from several sources, e.g. map or contour matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Automation & Control Theory (AREA)
  • Genetics & Genomics (AREA)
  • Physiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a path planning method, a system, electronic equipment and a storage medium based on simulation, wherein the method comprises the following steps: constructing at least one virtual simulation scene, and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene; configuring a data acquisition controller for the virtual camera and/or the virtual laser radar, and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene based on the data acquisition controller; and training a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm, and planning a path according to the target path planning algorithm. The invention improves the environmental perception and learning ability of the target path planning algorithm.

Description

Simulation-based path planning method, system, electronic device and storage medium
Technical Field
The invention relates to the technical field of path planning algorithms, in particular to a path planning method and system based on simulation, electronic equipment and a storage medium.
Background
With the continuous development of modern machinery manufacturing and robotics, intelligent mobile robots play a very important role in fields such as factory automation, construction, the military and services. Mobile robots are now expected not merely to move, but to autonomously determine and find an optimal or near-optimal path from a starting state to a target state in different environments. In recent years, the robot path navigation problem has received great attention from researchers.
Traditional path planning algorithms include the simulated annealing algorithm, the artificial potential field method, the genetic algorithm and the like. Although these algorithms have strong path search capability, they depend too heavily on the real environment for training and lack the ability to perceive and learn the environment. As a result, when such an algorithm is applied to a new environment it must be retrained, and path planning therefore cannot be performed quickly and effectively.
Disclosure of Invention
In view of the above, it is necessary to provide a method, a system, an electronic device and a storage medium for path planning based on simulation, so as to solve the technical problem in the prior art that path planning cannot be performed quickly and efficiently due to the lack of environment perception and learning capability of a path planning algorithm.
In one aspect, the present invention provides a path planning method based on simulation, including:
constructing at least one virtual simulation scene, and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene;
configuring a data acquisition controller for the virtual camera and/or the virtual laser radar, and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene based on the data acquisition controller;
training a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm, and planning a path according to the target path planning algorithm.
In some possible implementations, the constructing at least one virtual simulation scene includes:
creating a plurality of virtual objects and configuring at least one virtual controller for the plurality of virtual objects;
obtaining the at least one virtual simulation scene based on the at least one virtual controller adjusting the size and position of the plurality of virtual objects.
In some possible implementations, the controlling, based on the data acquisition controller, the virtual camera and/or the virtual lidar to acquire at least one scene data corresponding to the at least one virtual simulation scene includes:
based on the data acquisition controller, adjusting the scene configuration parameters of the at least one virtual simulation scene to obtain adjusted scene configuration parameters;
the virtual camera and/or the virtual lidar collects the at least one scene data corresponding to the at least one virtual simulation scene based on the adjusted scene configuration parameters.
In some possible implementations, the scene configuration parameters include a camera performance parameter, a camera position parameter, a camera pose parameter of the virtual camera, a lidar performance parameter, a lidar position parameter, and a lidar pose parameter of the virtual lidar.
In some possible implementations, after the building at least one virtual simulation scenario, the method further includes:
creating an illumination controller, and adjusting the illumination parameters of the at least one virtual simulation scene through the illumination controller.
In some possible implementations, the preset path planning algorithm is a deep deterministic policy gradient (DDPG) algorithm.
In some possible implementation manners, the training a preset path planning algorithm based on the scene data to obtain a target path planning algorithm includes:
constructing an initial path planning algorithm, wherein the initial path planning algorithm comprises a real policy network, a real Q network, a target policy network and a target Q network;
initializing network parameters of the real policy network and the real Q network, and copying the network parameters into the target policy network and the target Q network; the network parameters include a first weight of the real policy network and a second weight of the real Q network;
selecting a current-time behavior according to the at least one scene data and the real policy network, executing the current-time behavior, and returning an action reward for the current-time behavior and a next-time behavior;
acquiring a current-time environment state and a next-time environment state, and constructing a training data set according to the current-time environment state, the next-time environment state, the action reward and the next-time behavior;
randomly sampling from the training data set to obtain a plurality of small batches of training data;
respectively training the real policy network and the real Q network based on the plurality of small batches of training data to obtain a transition path planning algorithm;
determining a policy gradient of the real policy network and a gradient of the real Q network;
respectively optimizing the first weight and the second weight based on a preset optimization method to correspondingly obtain a first optimization weight and a second optimization weight;
determining a loss value of the transition path planning algorithm, and judging whether the loss value is smaller than a loss threshold value; if the loss value is smaller than the loss threshold value, the transition path planning algorithm is the target path planning algorithm; and if the loss value is greater than or equal to the loss threshold value, updating the transition path planning algorithm.
In another aspect, the present invention further provides a path planning system based on simulation, including: the system comprises a virtual simulation scene construction module, a scene data acquisition module and a path planning module;
the virtual simulation scene construction module is used for constructing at least one virtual simulation scene and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene;
the scene data acquisition module is used for configuring a data acquisition controller for the virtual camera and/or the virtual laser radar and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene based on the data acquisition controller;
the path planning module is used for training a preset path planning algorithm based on the at least one scene data, obtaining a target path planning algorithm and planning a path according to the target path planning algorithm.
In another aspect, the present invention also provides an electronic device comprising a memory and a processor, wherein,
the memory is used for storing programs;
the processor, coupled to the memory, is configured to execute the program stored in the memory to implement the steps in the simulation-based path planning method in any of the above implementation manners.
In another aspect, the present invention further provides a computer-readable storage medium for storing a computer-readable program or instruction, which when executed by a processor can implement the steps in the simulation-based path planning method described in any one of the above implementation manners.
The beneficial effects of the above embodiments are as follows. The simulation-based path planning method provided by the invention first constructs at least one virtual simulation scene and adds a virtual camera and/or a virtual laser radar to each virtual simulation scene; it then configures a data acquisition controller for the virtual camera and/or the virtual laser radar and, based on the data acquisition controller, controls the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene; finally, it trains a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm and plans a path according to the target path planning algorithm. Because at least one type of virtual simulation scene can be constructed as required, the environment perception and learning capability of the trained target path planning algorithm is improved, which improves its adaptability to new environments and the effectiveness of path planning in a new environment. Furthermore, the preset path planning algorithm can be trained without any real hardware (such as a real camera or a real laser radar), which makes training more convenient and the target path planning algorithm faster to obtain. Because no real hardware is needed, the cost of training the path planning algorithm is also reduced.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart illustrating an embodiment of a simulation-based path planning method according to the present invention;
FIG. 2 is a schematic flow chart of one embodiment of S101 of FIG. 1;
FIG. 3 is a schematic flow chart of one embodiment of S102 of FIG. 1;
FIG. 4 is a schematic flow chart of one embodiment of S103 of FIG. 1;
FIG. 5 is a schematic structural diagram of an embodiment of a simulation-based path planning system according to the present invention;
FIG. 6 is a schematic structural diagram of an embodiment of an electronic device provided in the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the embodiments of the present invention, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that three relationships may exist, for example: a and/or B, may represent: a exists alone, A and B exist simultaneously, and B exists alone.
The terms "comprises," "comprising," and any other variation thereof, in the embodiments of the present invention, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus.
The naming or numbering of the steps appearing in the embodiments of the present invention does not mean that the steps in the method flow must be executed according to the chronological/logical order indicated by the naming or numbering, and the named or numbered steps of the flow may change the execution order according to the technical purpose to be achieved, as long as the same or similar technical effects are achieved.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The invention provides a path planning method, a path planning system, electronic equipment and a storage medium based on simulation, which are respectively explained below.
FIG. 1 is a schematic flowchart of an embodiment of the simulation-based path planning method provided by the present invention. As shown in FIG. 1, the simulation-based path planning method includes:
S101, constructing at least one virtual simulation scene, and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene;
S102, configuring a data acquisition controller for the virtual camera and/or the virtual laser radar, and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene based on the data acquisition controller;
S103, training a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm, and planning a path according to the target path planning algorithm.
Compared with the prior art, the simulation-based path planning method provided by the embodiment of the invention first constructs at least one virtual simulation scene and adds a virtual camera and/or a virtual laser radar to each virtual simulation scene; it then configures a data acquisition controller for the virtual camera and/or the virtual laser radar and, based on the data acquisition controller, controls the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene; finally, it trains a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm and plans a path according to the target path planning algorithm. Because at least one type of virtual simulation scene can be constructed as required, the environment perception and learning capability of the trained target path planning algorithm is improved, which improves its adaptability to new environments and the effectiveness of path planning in a new environment. Furthermore, the preset path planning algorithm can be trained without any real hardware (such as a real camera or a real laser radar), which makes training more convenient and the target path planning algorithm faster to obtain. Because no real hardware is needed, the cost of training the path planning algorithm is also reduced.
It should be noted that: the virtual simulation scene in step S101 may be a virtual simulation scene of various moving target objects (including robots and the like) on various forms of paths (including indoor paths and the like).
In one embodiment of the present invention, the virtual simulation scene is a surgical scene that includes, but is not limited to, the doctors and nurses performing the operation, surgical equipment (a surgical robot, lamps, an operating table, various medical devices, etc.), cabinets, and the like.
It should also be noted that: the scene data in step S102 includes scene image data acquired by the virtual camera and/or scene point cloud data acquired by the virtual lidar.
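As an illustration of what scene point cloud data from a virtual laser radar might look like, the following is a minimal sketch and not the method of the invention. It assumes a flat 2-D scene whose obstacles are circles and a sensor located outside every obstacle; the names `Circle`, `ray_circle_distance` and `scan`, and all default values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Circle:
    """Hypothetical circular obstacle in a 2-D virtual scene."""
    cx: float
    cy: float
    r: float


def ray_circle_distance(ox, oy, dx, dy, c):
    """Distance along the unit ray (dx, dy) from (ox, oy) to circle c, or None.

    Solves |o + t*d - center|^2 = r^2 for the nearest non-negative t.
    The ray origin is assumed to lie outside the circle.
    """
    fx, fy = ox - c.cx, oy - c.cy
    b = fx * dx + fy * dy
    disc = b * b - (fx * fx + fy * fy - c.r * c.r)
    if disc < 0:
        return None  # the ray misses the circle
    t = -b - math.sqrt(disc)
    return t if t >= 0 else None


def scan(ox, oy, obstacles, n_beams=360, fov_deg=360.0, max_range=10.0):
    """Cast n_beams rays over fov_deg and return the (x, y) hit points."""
    points = []
    for i in range(n_beams):
        angle = math.radians(-fov_deg / 2 + fov_deg * i / n_beams)
        dx, dy = math.cos(angle), math.sin(angle)
        hits = [ray_circle_distance(ox, oy, dx, dy, c) for c in obstacles]
        hits = [t for t in hits if t is not None and t <= max_range]
        if hits:
            t = min(hits)  # keep only the closest obstacle along this beam
            points.append((ox + t * dx, oy + t * dy))
    return points


cloud = scan(0.0, 0.0, [Circle(5.0, 0.0, 1.0)])
```

The parameters `fov_deg`, `max_range` and `n_beams` play the role of the field of view, detection distance and number of output points that the description lists as laser radar performance parameters.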
In some embodiments of the present invention, as shown in fig. 2, step S101 includes:
S201, creating a plurality of virtual objects, and configuring at least one virtual controller for the plurality of virtual objects;
S202, adjusting the sizes and positions of the plurality of virtual objects based on the at least one virtual controller to obtain the at least one virtual simulation scene.
According to the embodiment of the invention, the sizes and positions of the virtual objects can be adjusted according to requirements by arranging the at least one virtual controller, so that at least one virtual simulation scene is obtained, and the convenience for constructing the at least one virtual simulation scene is improved.
To make it easier for a virtual controller to control its corresponding virtual objects, in some embodiments of the present invention each virtual object has a unique identification number (ID number), and the virtual controller controls the virtual object according to this identification number, which improves the reliability of controlling the virtual object.
In order to reduce the number of virtual controllers, in some embodiments of the present invention one virtual controller may be configured for the plurality of virtual objects, that is, all virtual objects are controlled through one virtual controller, which improves the degree of control integration. Specifically, the virtual controller controls each virtual object by acquiring the identification number of that virtual object.
It should be understood that: to avoid control errors caused by controlling all virtual objects with only one virtual controller, in some embodiments of the invention, one virtual controller may be configured for each virtual object. The virtual controller corresponding to each virtual object controls the virtual object, so that the control precision of the virtual object can be improved.
In some embodiments of the present invention, in order to reduce the number of virtual controllers while ensuring reliability of controlling virtual objects, the same number of virtual controllers as the number of categories may be correspondingly set according to the categories of virtual objects, that is: a virtual controller controls at least one virtual object of a class. For example: when the plurality of virtual objects include a plurality of virtual human bodies and a plurality of virtual devices, two virtual controllers may be provided to control the plurality of virtual human bodies and the plurality of virtual devices, respectively.
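The controller arrangements described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class names `VirtualObject` and `VirtualController`, the helper `controllers_by_category`, and the use of 3-tuples for size and position are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualObject:
    """Hypothetical virtual object addressed by a unique ID number."""
    object_id: int
    category: str      # e.g. "human" or "device"
    size: tuple        # (width, depth, height), assumed representation
    position: tuple    # (x, y, z), assumed representation


class VirtualController:
    """Adjusts the size and position of the objects it manages, by ID."""

    def __init__(self, objects):
        self._objects = {o.object_id: o for o in objects}

    def set_size(self, object_id, size):
        # Raises KeyError if this controller does not manage the object.
        self._objects[object_id].size = size

    def set_position(self, object_id, position):
        self._objects[object_id].position = position


def controllers_by_category(objects):
    """Create one controller per object category."""
    groups = {}
    for obj in objects:
        groups.setdefault(obj.category, []).append(obj)
    return {cat: VirtualController(objs) for cat, objs in groups.items()}


scene = [
    VirtualObject(1, "human", (0.5, 0.3, 1.8), (0.0, 0.0, 0.0)),
    VirtualObject(2, "device", (2.0, 0.8, 0.9), (1.0, 1.0, 0.0)),
    VirtualObject(3, "human", (0.5, 0.3, 1.7), (2.0, 0.0, 0.0)),
]
controllers = controllers_by_category(scene)
controllers["device"].set_position(2, (1.5, 1.0, 0.0))
```

Passing the whole object list to a single `VirtualController` corresponds to the one-controller arrangement, and passing a one-element list per object corresponds to the one-controller-per-object arrangement.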
In a conventional path planning algorithm, it often occurs that the path planning algorithm is affected by the performance, the setting position and other factors of a camera and/or a laser radar. Therefore, in order to improve the comprehensiveness of the obtained scene data to improve the perception and learning ability of the path planning algorithm for these factors, in some embodiments of the present invention, as shown in fig. 3, step S102 includes:
S301, adjusting scene configuration parameters of the at least one virtual simulation scene based on the data acquisition controller to obtain adjusted scene configuration parameters;
S302, the virtual camera and/or the virtual laser radar collecting the at least one scene data corresponding to the at least one virtual simulation scene based on the adjusted scene configuration parameters.
The scene configuration parameters include, but are not limited to, the camera performance parameters, camera position parameters and camera attitude parameters of the virtual camera, and the laser radar performance parameters, laser radar position parameters and laser radar attitude parameters of the virtual laser radar.
Different scene configuration parameters yield different scene data for the same virtual simulation scene. Adjusting the scene configuration parameters and acquiring scene data under each setting therefore makes the acquired scene data more comprehensive, which improves the ability of the path planning algorithm to learn and perceive virtual simulation scenes under different scene configuration parameters and, in turn, the reliability of its path planning.
In some embodiments of the present invention, the camera performance parameters include, but are not limited to, the field of view (FOV), focal length, photosensitive area, and the like of the virtual camera. The laser radar performance parameters include, but are not limited to, the field of view, detection distance, number of output points, and the like of the virtual laser radar.
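Steps S301 and S302 can be sketched as a controller that perturbs a base sensor configuration and then triggers an acquisition for each adjusted configuration. This is only a sketch: `SensorConfig`, `DataAcquisitionController`, the jitter ranges and the `capture` callable (which stands in for the virtual camera or virtual laser radar) are hypothetical and are not defined by the source.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SensorConfig:
    """Hypothetical subset of the scene configuration parameters."""
    fov_deg: float     # performance parameter: field of view
    position: tuple    # position parameter: (x, y, z)
    attitude: tuple    # attitude parameter: (roll, pitch, yaw) in degrees


class DataAcquisitionController:
    """Adjusts sensor parameters (S301) and collects scene data (S302)."""

    def __init__(self, base, seed=0, fov_jitter=10.0, pos_jitter=0.5,
                 att_jitter=15.0):
        self.base = base
        self.rng = random.Random(seed)
        self.fov_jitter = fov_jitter
        self.pos_jitter = pos_jitter
        self.att_jitter = att_jitter

    def adjusted(self):
        """Return a randomly perturbed copy of the base configuration."""
        b, r = self.base, self.rng
        return replace(
            b,
            fov_deg=b.fov_deg + r.uniform(-self.fov_jitter, self.fov_jitter),
            position=tuple(p + r.uniform(-self.pos_jitter, self.pos_jitter)
                           for p in b.position),
            attitude=tuple(a + r.uniform(-self.att_jitter, self.att_jitter)
                           for a in b.attitude),
        )

    def collect(self, capture, n):
        """Call capture(config) for n adjusted configurations."""
        configs = [self.adjusted() for _ in range(n)]
        return [(cfg, capture(cfg)) for cfg in configs]


base = SensorConfig(90.0, (0.0, 0.0, 2.0), (0.0, -30.0, 0.0))
controller = DataAcquisitionController(base, seed=1)
# The lambda is a placeholder for rendering an image or casting lidar rays.
samples = controller.collect(lambda cfg: {"fov": cfg.fov_deg}, n=5)
```

Random perturbation is one possible way to adjust the parameters; a fixed sweep over chosen values would serve the same purpose.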
Since both the virtual camera and the virtual lidar are affected by the illumination condition, in order to further improve the comprehensiveness of the scene data and improve the reliability of the path planning algorithm, in some embodiments of the present invention, after step S101, the method further includes:
creating an illumination controller, and adjusting the illumination parameters of the at least one virtual simulation scene through the illumination controller.
According to the embodiment of the invention, the illumination controller is created, and the illumination parameters of at least one virtual simulation scene are adjusted through the illumination controller, so that the scene data of the virtual simulation scene under different illumination parameters can be obtained, the comprehensiveness of the obtained scene data is further improved, and the path planning reliability of the path planning algorithm can be further improved.
The illumination parameters include, but are not limited to, illumination intensity and illumination type. For example, the illumination may be adjusted to white light, warm light, or other special lighting.
It should be noted that a change in any one of the illumination parameters, camera performance parameters, camera position parameters, camera attitude parameters, laser radar performance parameters, laser radar position parameters and laser radar attitude parameters produces different scene data. Setting and adjusting these parameters therefore increases the amount of scene data obtained, which increases the number of samples available for training the path planning algorithm and improves the reliability of the target path planning algorithm.
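To illustrate how adjusting several parameters multiplies the number of distinct scene data samples, the sketch below enumerates every combination of a few parameter values. The parameter names and values are hypothetical and chosen only for the example.

```python
from itertools import product

# Hypothetical value grids; none of these values come from the source.
illumination_types = ["white", "warm"]
illumination_intensities = [0.5, 1.0, 1.5]
camera_fovs_deg = [60, 90]
camera_positions = [(0, 0, 2), (1, 0, 2), (0, 1, 2)]


def scene_variants(**grids):
    """Yield one dict per combination of the supplied parameter values."""
    keys = list(grids)
    for values in product(*(grids[k] for k in keys)):
        yield dict(zip(keys, values))


variants = list(scene_variants(
    illumination_type=illumination_types,
    illumination_intensity=illumination_intensities,
    camera_fov_deg=camera_fovs_deg,
    camera_position=camera_positions,
))
print(len(variants))  # 2 * 3 * 2 * 3 = 36 parameter combinations
```

Each dictionary would be applied to one virtual simulation scene before the sensors acquire data, so a single scene yields one sample per combination.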
It should also be noted that the preset path planning algorithm in step S103 may be any one of a particle swarm optimization (PSO) algorithm, a genetic algorithm (GA), or a reinforcement learning algorithm.
For example, the preset path planning algorithm may be a Q-learning algorithm, a deep Q-network (DQN) algorithm, a dueling deep Q-network (Dueling DQN) algorithm, a double deep Q-network (DDQN) algorithm, or a deep deterministic policy gradient (DDPG) algorithm.
Because the DDPG algorithm can handle high-dimensional continuous actions, which is a great advantage in continuous control problems, the preset path planning algorithm in a specific embodiment of the present invention is the DDPG algorithm. In that case, as shown in FIG. 4, step S103 includes:
S401, constructing an initial path planning algorithm, wherein the initial path planning algorithm comprises a real policy network, a real Q network, a target policy network and a target Q network;
S402, initializing network parameters of the real policy network and the real Q network, and copying the network parameters into the target policy network and the target Q network; the network parameters comprise a first weight of the real policy network and a second weight of the real Q network;
S403, selecting a current-time behavior according to the at least one scene data and the real policy network, executing the current-time behavior, and returning an action reward for the current-time behavior and a next-time behavior;
S404, acquiring the current-time environment state and the next-time environment state, and constructing a training data set according to the current-time environment state, the next-time environment state, the action reward and the next-time behavior;
S405, randomly sampling from the training data set to obtain a plurality of small batches of training data;
S406, respectively training the real policy network and the real Q network based on the plurality of small batches of training data to obtain a transition path planning algorithm;
S407, determining a policy gradient of the real policy network and a gradient of the real Q network;
S408, respectively optimizing the first weight and the second weight based on a preset optimization method to correspondingly obtain a first optimization weight and a second optimization weight;
S409, determining a loss value of the transition path planning algorithm, and judging whether the loss value is smaller than a loss threshold value; if the loss value is smaller than the loss threshold value, the transition path planning algorithm is the target path planning algorithm; and if the loss value is greater than or equal to the loss threshold value, updating the transition path planning algorithm.
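The building blocks of steps S401, S402, S404, S405 and S407 can be sketched in a deliberately small form. The sketch makes several assumptions that are not in the source: the state and action are scalars, the real policy network is a linear function, the real Q network is linear in quadratic features, transitions are stored in the standard DDPG form (state, action, reward, next state), and the target networks are refreshed with the soft update commonly used in DDPG. All names and constants are hypothetical.

```python
import random

GAMMA, TAU = 0.99, 0.01  # assumed discount factor and soft-update rate


def features(s, a):
    """Quadratic features for a toy critic Q(s, a) = w . phi(s, a)."""
    return [s * s, s * a, a * a, 1.0]


def actor(theta, s):
    """Toy linear policy mu(s) = theta[0] * s + theta[1]."""
    return theta[0] * s + theta[1]


def critic(w, s, a):
    return sum(wi * fi for wi, fi in zip(w, features(s, a)))


def init_networks(rng):
    """S401-S402: create the real networks and copy them into the targets."""
    theta = [rng.uniform(-0.1, 0.1) for _ in range(2)]
    w = [rng.uniform(-0.1, 0.1) for _ in range(4)]
    return theta, w, list(theta), list(w)


class ReplayBuffer:
    """S404-S405: store transitions and draw random small batches."""

    def __init__(self, capacity, rng):
        self.data, self.capacity, self.rng = [], capacity, rng

    def add(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))
        if len(self.data) > self.capacity:
            self.data.pop(0)  # drop the oldest transition

    def sample(self, batch_size):
        return self.rng.sample(self.data, min(batch_size, len(self.data)))


def critic_loss_and_grad(w, theta_t, w_t, batch):
    """Mean squared TD error of the real Q network and its gradient in w."""
    loss, grad = 0.0, [0.0] * len(w)
    for s, a, r, s_next in batch:
        # The TD target is computed with the target networks only.
        y = r + GAMMA * critic(w_t, s_next, actor(theta_t, s_next))
        err = critic(w, s, a) - y
        loss += err * err / len(batch)
        for i, f in enumerate(features(s, a)):
            grad[i] += 2.0 * err * f / len(batch)
    return loss, grad


def actor_grad(theta, w, batch):
    """S407: deterministic policy gradient, mean of dQ/da * dmu/dtheta."""
    grad = [0.0, 0.0]
    for s, _, _, _ in batch:
        a = actor(theta, s)
        dq_da = w[1] * s + 2.0 * w[2] * a
        grad[0] += dq_da * s / len(batch)
        grad[1] += dq_da / len(batch)
    return grad  # ascent direction: theta moves along +grad


def soft_update(target, online, tau=TAU):
    """Move the target parameters a fraction tau toward the real ones."""
    return [(1 - tau) * t + tau * o for t, o in zip(target, online)]
```

In a practical system both networks would be neural networks whose inputs are derived from the scene image and point cloud data; the linear forms here only keep the gradients short enough to write out by hand.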
Specifically, the initial path planning algorithm and the target path planning algorithm are both used for path planning of the surgical robot. Correspondingly, the current-time behavior may be the speed, position, etc. of the surgical robot at the current time, and the next-time behavior is the speed, position, etc. of the surgical robot at the next time.
It should be noted that: the action rewards in step S403 and step S404 include, but are not limited to, a reward value for the surgical robot completing the path planning within a predetermined time, a reward value when the surgical robot touches an obstacle, a position reward when the surgical robot reaches a destination, and the like.
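A minimal sketch of how the reward terms listed above could be combined into one scalar is given below. The function name `action_reward`, its parameters, and all reward magnitudes are illustrative assumptions; the embodiment does not specify numerical values or how the terms are combined.

```python
def action_reward(reached_goal, collided, elapsed_s, time_limit_s,
                  goal_reward=10.0, in_time_bonus=5.0, collision_penalty=-10.0):
    """Combine the reward terms named in the text into a single scalar.

    All magnitudes are illustrative assumptions, not values from the embodiment.
    """
    reward = 0.0
    if collided:
        reward += collision_penalty      # the robot touched an obstacle
    if reached_goal:
        reward += goal_reward            # position reward at the destination
        if elapsed_s <= time_limit_s:
            reward += in_time_bonus      # planning finished within the preset time
    return reward
```

With these assumed values, reaching the destination in time without a collision yields `15.0`, while a collision without reaching the destination yields `-10.0`.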
It should also be noted that: the preset optimization method in step S408 may be any one of stochastic gradient descent (SGD), momentum optimization (Momentum), the adaptive gradient algorithm (AdaGrad), root mean square propagation (RMSProp), and adaptive moment estimation (Adam).
In order to improve the optimization efficiency, in a specific embodiment of the present invention, the preset optimization method is the Adam optimization algorithm.
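For reference, one Adam update of a weight vector can be sketched as below. This is the standard textbook form of Adam written over plain Python lists; the function name `adam_step` and the default hyperparameters (the commonly used `lr=1e-3`, `beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are assumptions, since the embodiment does not state them.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over flat lists of floats; returns new (params, m, v).

    t is the 1-based step counter used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)             # bias-corrected moments
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

In step S408 such an update would be applied separately to the first weight (policy network) and the second weight (Q network), each with its own moment buffers `m` and `v`.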
It is further noted that: updating the transition path planning algorithm in step S409 specifically means repeating steps S403 to S408.
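The stopping rule of step S409 can be sketched as a simple loop, shown below under stated assumptions. The callables `collect_step` and `update_step` are hypothetical placeholders for steps S403 to S405 and steps S406 to S408 respectively, and the `max_iterations` safety cap is an addition of this sketch that is not part of the embodiment.

```python
def train_until_converged(collect_step, update_step, loss_threshold,
                          max_iterations=10000):
    """Repeat S403-S408 until the loss falls below the threshold (S409).

    collect_step() returns a mini-batch; update_step(batch) returns a float loss.
    Both callables are hypothetical placeholders for the steps of the method.
    """
    loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        batch = collect_step()          # S403-S405: act, store, sample
        loss = update_step(batch)       # S406-S408: train and optimize weights
        if loss < loss_threshold:       # S409: accept the transition algorithm
            return iteration, loss
    return max_iterations, loss         # safety cap, an assumption of this sketch
```

The function returns the number of iterations performed together with the last loss value, so the caller can tell whether the threshold was actually reached.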
In order to better implement the simulation-based path planning method in the embodiment of the present invention, the embodiment of the present invention correspondingly further provides a simulation-based path planning system on the basis of that method. As shown in fig. 5, the simulation-based path planning system 500 includes: a virtual simulation scene construction module 501, a scene data acquisition module 502 and a path planning module 503;
the virtual simulation scene construction module 501 is configured to construct at least one virtual simulation scene, and add a virtual camera and/or a virtual lidar to each virtual simulation scene in the at least one virtual simulation scene;
the scene data acquiring module 502 is configured to configure a data acquisition controller for the virtual camera and/or the virtual lidar, and control the virtual camera and/or the virtual lidar to acquire at least one scene data corresponding to at least one virtual simulation scene based on the data acquisition controller;
the path planning module 503 is configured to train a preset path planning algorithm based on at least one scene data, obtain a target path planning algorithm, and plan a path according to the target path planning algorithm.
The simulation-based path planning system 500 provided in the foregoing embodiment may implement the technical solutions described in the foregoing simulation-based path planning method embodiments, and the specific implementation principles of the modules or units may refer to the corresponding contents in the foregoing simulation-based path planning method embodiments, and are not described herein again.
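One possible way to wire the three modules of system 500 together is sketched below. Modeling each module as a plain callable, and the class name `PathPlanningSystem` with its field and method names, are assumptions of this sketch rather than the structure disclosed in the embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class PathPlanningSystem:
    """Chains modules 501, 502 and 503 into one pipeline.

    Each module is modeled as a callable; the signatures are assumptions.
    """
    build_scenes: Callable[[], List[Any]]                  # module 501
    collect_scene_data: Callable[[List[Any]], List[Any]]   # module 502
    train_planner: Callable[[List[Any]], Callable[[Any, Any], List[Any]]]  # module 503

    def run(self, start, goal):
        scenes = self.build_scenes()             # construct virtual scenes
        data = self.collect_scene_data(scenes)   # acquire scene data via sensors
        planner = self.train_planner(data)       # obtain the target algorithm
        return planner(start, goal)              # plan a path with it
```

Keeping the modules as injected callables makes it easy to substitute a stub for any stage when testing the others.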
As shown in fig. 6, the present invention further provides an electronic device 600 accordingly. The electronic device 600 comprises a processor 601, a memory 602 and a display 603. Fig. 6 shows only some of the components of the electronic device 600, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
In some embodiments, the processor 601 may be a Central Processing Unit (CPU), a microprocessor or another data processing chip, and is used to run program code or process data stored in the memory 602, for example to execute the simulation-based path planning method of the present invention.
In some embodiments, the processor 601 may be a single server or a server group. The server group may be centralized or distributed. In some embodiments, the processor 601 may be local or remote. In some embodiments, the processor 601 may be implemented on a cloud platform. In an embodiment, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, and the like, or any combination thereof.
In some embodiments, the memory 602 may be an internal storage unit of the electronic device 600, such as a hard disk or internal memory of the electronic device 600. In other embodiments, the memory 602 may be an external storage device of the electronic device 600, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the electronic device 600.
Further, the memory 602 may also include both an internal storage unit and an external storage device of the electronic device 600. The memory 602 is used for storing the application software installed on the electronic device 600 and various types of data.
In some embodiments, the display 603 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like. The display 603 is used for displaying information on the electronic device 600 and for displaying a visual user interface. The components 601 to 603 of the electronic device 600 communicate with each other via a system bus.
In some embodiments of the invention, when the processor 601 executes the simulation-based path planning program in the memory 602, the following steps may be implemented:
constructing at least one virtual simulation scene, and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene;
configuring a data acquisition controller for the virtual camera and/or the virtual laser radar, and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to at least one virtual simulation scene based on the data acquisition controller;
training a preset path planning algorithm based on at least one scene data to obtain a target path planning algorithm, and planning a path according to the target path planning algorithm.
It should be understood that: the processor 601, when executing the simulation-based path planning program in the memory 602, may also perform other functions in addition to the above functions, which may be specifically referred to in the description of the corresponding method embodiments above.
Further, the type of the electronic device 600 is not particularly limited in the embodiment of the present invention. The electronic device 600 may be a portable electronic device such as a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a wearable device, or a laptop computer. Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices running iOS, Android, Microsoft, or other operating systems. The portable electronic device may also be another portable electronic device, such as a laptop computer with a touch-sensitive surface (e.g., a touch panel). It should also be understood that, in other embodiments of the present invention, the electronic device 600 may not be a portable electronic device but a desktop computer having a touch-sensitive surface (e.g., a touch pad).
Accordingly, the present application further provides a computer-readable storage medium, which is used for storing a computer-readable program or instruction, and when the program or instruction is executed by a processor, the program or instruction can implement the steps or functions in the simulation-based path planning method provided by the above method embodiments.
Those skilled in the art will appreciate that all or part of the flow of the method implementing the above embodiments may be implemented by instructing relevant hardware (such as a processor, a controller, etc.) by a computer program, and the computer program may be stored in a computer readable storage medium. The computer readable storage medium is a magnetic disk, an optical disk, a read-only memory or a random access memory.
The simulation-based path planning method, system, electronic device and storage medium provided by the present invention are described in detail above, and a specific example is applied in the present document to explain the principle and implementation of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for those skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A simulation-based path planning method is characterized by comprising the following steps:
constructing at least one virtual simulation scene, and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene;
configuring a data acquisition controller for the virtual camera and/or the virtual laser radar, and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene based on the data acquisition controller;
training a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm, and planning a path according to the target path planning algorithm.
2. The simulation-based path planning method of claim 1, wherein the constructing at least one virtual simulation scenario comprises:
creating a plurality of virtual objects and configuring at least one virtual controller for the plurality of virtual objects;
obtaining the at least one virtual simulation scene based on the at least one virtual controller adjusting the size and position of the plurality of virtual objects.
3. The simulation-based path planning method according to claim 1, wherein the controlling the virtual camera and/or the virtual lidar to collect at least one scene data corresponding to the at least one virtual simulation scene based on the data collection controller comprises:
based on the data acquisition controller, adjusting the scene configuration parameters of the at least one virtual simulation scene to obtain adjusted scene configuration parameters;
the virtual camera and/or the virtual lidar collects the at least one scene data corresponding to the at least one virtual simulation scene based on the adjusted scene configuration parameters.
4. The simulation-based path planning method of claim 3, wherein the scene configuration parameters comprise camera performance parameters, camera position parameters, camera pose parameters of the virtual camera, lidar performance parameters, lidar position parameters, and lidar pose parameters of the virtual lidar.
5. The simulation-based path planning method of claim 1, wherein after said constructing at least one virtual simulation scenario, further comprising:
and creating an illumination controller, and adjusting the illumination parameters of the at least one virtual simulation scene through the illumination controller.
6. The simulation-based path planning method of claim 1, wherein the preset path planning algorithm is a deep deterministic policy gradient (DDPG) algorithm.
7. The simulation-based path planning method of claim 6, wherein the training of a preset path planning algorithm based on the at least one scene data to obtain a target path planning algorithm comprises:
constructing an initial path planning algorithm, wherein the initial path planning algorithm comprises a real policy network, a real Q network, a target policy network and a target Q network;
initializing network parameters of the real policy network and the real Q network, and copying the network parameters into the target policy network and the target Q network; the network parameters include a first weight of the real policy network and a second weight of the real Q network;
selecting a current-time behavior according to the at least one scene data and the real policy network, executing the current-time behavior, and returning an action reward of the current-time behavior and a next-time behavior;
acquiring a current-time environment state and a next-time environment state, and constructing a training data set according to the current-time environment state, the next-time environment state, the action reward and the next-time behavior;
randomly sampling from the training data set to obtain a plurality of mini-batches of training data;
respectively training the real policy network and the real Q network based on the plurality of mini-batches of training data to obtain a transition path planning algorithm;
determining a policy gradient of the real policy network and a gradient of the real Q network;
respectively optimizing the first weight and the second weight based on a preset optimization method to correspondingly obtain a first optimization weight and a second optimization weight;
determining a loss value of the transition path planning algorithm, and judging whether the loss value is smaller than a loss threshold value; if the loss value is smaller than the loss threshold value, the transition path planning algorithm is the target path planning algorithm; and if the loss value is greater than or equal to the loss threshold value, updating the transition path planning algorithm.
8. A simulation-based path planning system, comprising: the system comprises a virtual simulation scene construction module, a scene data acquisition module and a path planning module;
the virtual simulation scene construction module is used for constructing at least one virtual simulation scene and adding a virtual camera and/or a virtual laser radar to each virtual simulation scene in the at least one virtual simulation scene;
the scene data acquisition module is used for configuring a data acquisition controller for the virtual camera and/or the virtual laser radar and controlling the virtual camera and/or the virtual laser radar to acquire at least one scene data corresponding to the at least one virtual simulation scene based on the data acquisition controller;
the path planning module is used for training a preset path planning algorithm based on the at least one scene data, obtaining a target path planning algorithm and planning a path according to the target path planning algorithm.
9. An electronic device comprising a memory and a processor, wherein,
the memory is used for storing programs;
the processor, coupled to the memory, is configured to execute the program stored in the memory to implement the steps in the simulation-based path planning method according to any of the preceding claims 1 to 7.
10. A computer-readable storage medium storing a computer-readable program or instructions, which when executed by a processor, is capable of implementing the steps of the simulation-based path planning method according to any one of claims 1 to 7.
CN202210250343.8A 2022-03-09 2022-03-09 Simulation-based path planning method, system, electronic device and storage medium Pending CN114659524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210250343.8A CN114659524A (en) 2022-03-09 2022-03-09 Simulation-based path planning method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210250343.8A CN114659524A (en) 2022-03-09 2022-03-09 Simulation-based path planning method, system, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN114659524A true CN114659524A (en) 2022-06-24

Family

ID=82029529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210250343.8A Pending CN114659524A (en) 2022-03-09 2022-03-09 Simulation-based path planning method, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN114659524A (en)

Similar Documents

Publication Publication Date Title
Chen et al. Robot as a service in cloud computing
CN110471297B (en) Multi-agent cooperative control method, system and equipment
US8886829B1 (en) Methods and systems for robot cloud computing using slug trails
CN103559182B (en) Systems and methods for asynchronous searching and filtering of data
CN112001585A (en) Multi-agent decision method and device, electronic equipment and storage medium
US11537022B1 (en) Dynamic tenancy
US20210081240A1 (en) Evolutionary modelling based non-disruptive scheduling and management of computation jobs
CN111095170B (en) Virtual reality scene, interaction method thereof and terminal equipment
US11036474B2 (en) Automating service maturity analysis and estimation
CN104317297A (en) Robot obstacle avoidance method under unknown environment
Sanchis et al. Using natural interfaces for human-agent immersion
WO2021248856A1 (en) Robot control method and, system, storage medium and smart robot
CN108549487A (en) Virtual reality exchange method and device
CN114091589B (en) Model training method and device, electronic equipment and medium
CN114924862A (en) Task processing method, device and medium implemented by integer programming solver
CN112016678A (en) Training method and device for strategy generation network for reinforcement learning and electronic equipment
CN114659524A (en) Simulation-based path planning method, system, electronic device and storage medium
JP2017000575A (en) Computer program
CN113390412B (en) Full-coverage path planning method and system for robot, electronic equipment and medium
WO2018220405A1 (en) Methods of and apparatus for locating energy harvesting devices in an environment
JP7380556B2 (en) Information processing device, information processing method and program
CN112270083A (en) Multi-resolution modeling and simulation method and system
CN116561478A (en) Transformer substation plane layout method based on butterfly algorithm of mixed particle swarm
US9563723B1 (en) Generation of an observer view in a virtual environment
JP5838279B1 (en) Server device, server program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination