CN117961899A - Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium - Google Patents

Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium

Info

Publication number
CN117961899A
Authority
CN
China
Prior art keywords
robot
frog
position information
parameters
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410163775.4A
Other languages
Chinese (zh)
Inventor
郑俊华
高开元
赵泽锋
彭茂棋
刘胤烨
梁晓裕
王杰彬
陈泺业
陈嘉鑫
刘杰锋
肖桂英
宋露曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Institute of Technology
Original Assignee
Guangzhou Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Institute of Technology filed Critical Guangzhou Institute of Technology
Priority to CN202410163775.4A priority Critical patent/CN117961899A/en
Publication of CN117961899A publication Critical patent/CN117961899A/en
Pending legal-status Critical Current

Abstract

The application discloses a control method for a frog-imitating robot, belonging to the technical field of robots. The method comprises the following steps: when the robot is detected to be at the initial moment of an operation period, acquiring state data of the robot and environment data of the position of the robot; determining target parameters to be input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model, and outputting operation parameters for the operation period; and controlling the robot according to the operation parameters. The application improves the accuracy of controlling the robot's movement in water.

Description

Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium
Technical Field
The present invention relates to the field of robots, and more particularly, to a frog-imitating robot control method, a frog-imitating robot control device, a frog-imitating robot, and a storage medium.
Background
A frog-imitating robot is an amphibious robot whose motion control in water is more complex than that of a propeller-driven machine. At present, a fixed motor control program is configured according to the task at hand to make the frog-imitating robot advance and steer in water. However, this driving method is limited to still water: because it cannot perceive changes in the environment, its movement accuracy in dynamic waters is low.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main aim of the invention is to provide a frog-imitating robot control method, a frog-imitating robot control device, a frog-imitating robot, and a storage medium, so as to improve the accuracy of controlling the robot's movement in water.
In order to achieve the above object, the present invention provides a control method of a frog-imitating robot, the control method of the frog-imitating robot comprising the steps of:
when the robot is detected to be at the initial moment of an operation period, acquiring state data of the robot and environment data of the position of the robot;
Determining target parameters input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model, and outputting operation parameters of the operation period;
and controlling the robot according to the operation parameters.
Optionally, the machine learning model comprises a neural network model, and the state data comprises: current position information at the initial moment, historical position information, and historical operation parameters, the historical position information and the historical operation parameters being, respectively, the current position information and the operation parameters of the adjacent operation period preceding the initial moment. Before the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period, the method further comprises the following steps:
inputting the historical position information, the current position information and the environmental data into the neural network model, and outputting calibration operation parameters;
determining a loss value based on the historical operating parameter and the calibration operating parameter;
And adjusting the neural network model according to the loss value.
Optionally, the machine learning model is an ensemble learning model, the ensemble comprising at least two individual learners and a combination strategy, and the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period comprises:
Inputting the target parameters into at least two individual learners to obtain at least two individual output results;
determining the operation parameters based on the combination strategy and the at least two individual output results.
Optionally, the state data includes current position information at the initial moment, first target position information, and second target position information of the operation period preceding the operation period corresponding to the initial moment, and before the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period, the method further includes:
when the distance between the current position information and the second target position information is larger than a preset distance threshold value, updating the first target position and/or adjusting the combination strategy;
and when the distance between the current position information and the second target position information is smaller than or equal to a preset distance threshold value, determining to execute the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period.
Optionally, the combination strategy is a predicted-value weighting method, and the adjusting of the combination strategy includes:
determining whether the environment data is consistent with the historical environment data of the operation period preceding the operation period corresponding to the initial moment;
Updating corresponding weighting weights of at least two of the individual learners when the environmental data and the historical environmental data are inconsistent;
and updating the corresponding weighting weights of all the individual learners when the environment data is consistent with the historical environment data.
Optionally, the state data includes a moving speed and a moving direction of the robot, current position information at the initial moment, and first target position information; the environment data includes a water flow direction and a water flow speed; and the step of determining the target parameters input into the machine learning model according to the state data and the environment data includes:
And determining the target parameters as the moving speed, the moving direction, the current position information, the first target position information, the water flow direction and the water flow speed according to the state data and the environment data.
Optionally, before the step of acquiring the state data of the robot and the environment data of the position of the robot when the robot is detected to be at the initial moment of an operation period, the method further includes:
and determining the number of the running periods and the target position of each running period according to the starting position, the ending position and the initial environment data of the robot.
In addition, in order to achieve the above object, the present invention also provides a control device of a frog-imitating robot, the control device of the frog-imitating robot comprising:
the detection module is used for acquiring state data of the robot and environment data of the position of the robot when the robot is detected to be at the initial moment of the operation period;
the computing module is used for determining target parameters input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period;
and the control module is used for controlling the robot according to the operation parameters.
In addition, in order to achieve the above object, the present invention also provides a robot, the robot comprising: a memory, a processor, and a control program of the frog-imitating robot stored in the memory and executable on the processor, wherein the control program of the frog-imitating robot is configured to implement the steps of the control method of the frog-imitating robot described above.
In order to achieve the above object, the present invention further provides a storage medium having stored thereon a control program of the frog-imitating robot, wherein the control program, when executed by a processor, implements the steps of any of the above control methods of the frog-imitating robot.
The invention provides a control method of a frog-imitating robot, the method comprising the following steps: when the robot is detected to be at the initial moment of an operation period, acquiring state data of the robot and environment data of the position of the robot; determining target parameters to be input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model, and outputting operation parameters for the operation period; and controlling the robot according to the operation parameters. The state data and the environment data at the initial moment of the operation period allow the robot to perceive both the environmental conditions and its own state; by determining the target parameters and inputting them into the machine learning model, the operation parameters required for the current period can be determined. Since the operation parameters are determined on the basis of the environment data, the robot can be controlled accurately even in dynamic waters.
Drawings
FIG. 1 is a schematic diagram of a robot in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of a control method of the frog-imitating robot of the present invention;
fig. 3 is a schematic flow chart of a control method of the frog-imitating robot according to a second embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a robot in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the robot may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a motion device 1003, a network interface 1004 and a memory 1005, wherein the communication bus 1002 is used to enable communication among these components. The motion device 1003 may include motors and may optionally be coupled to the communication bus 1002 through a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (WI-FI) interface). The memory 1005 may be a high-speed Random Access Memory (RAM) or a stable Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001.
Further, the motors may be synchronous motors, asynchronous motors, direct-current (DC) motors, etc., and a DC motor may be controlled by Proportional-Integral-Derivative (PID) control, Pulse Width Modulation (PWM) control, etc. PID control is the most commonly used method: feedback control is applied to states such as the speed and position of the motor, and the voltage and current of the motor power supply are adjusted accordingly, so that the motion of parts such as the thighs and calves of the frog-imitating robot is controlled and the robot moves in water. The motors may be controlled by converting the operation parameters into the specific control parameters of the PID controller.
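By way of illustration only, the sketch below shows how a single joint's target angle taken from the operation parameters might be tracked by a PID loop. The gains, the sampling period and the read_angle/set_voltage helpers are assumptions introduced for the example and are not specified by this disclosure.

```python
# Minimal PID position-control sketch (assumed gains and I/O helpers; not part of this disclosure).
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # The output is interpreted here as a motor voltage command.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def track_angle(pid, target_angle, read_angle, set_voltage, steps):
    """Drive one joint motor toward target_angle using feedback from read_angle."""
    for _ in range(steps):
        set_voltage(pid.step(target_angle, read_angle()))
```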
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not limit the robot, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
As shown in fig. 1, the memory 1005 as a storage medium may include an operating system, a data storage module, a network communication module, a user interface module, and a control program of the frog-like robot.
In the robot shown in fig. 1, the network interface 1004 is mainly used for data communication with other devices, and the motion device 1003 is mainly used for realizing the motion of the robot. The processor 1001 and the memory 1005 are provided in the robot, and the robot calls, through the processor 1001, the control program of the frog-imitating robot stored in the memory 1005 and executes the control method of the frog-imitating robot provided by the embodiments of the present invention.
The embodiment of the invention provides a control method of a frog-imitating robot, and referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the control method of the frog-imitating robot.
In this embodiment, the control method of the frog-imitating robot includes:
Step S10, when the robot is detected to be at the initial moment of an operation period, acquiring state data of the robot and environment data of the position of the robot;
The robot of this embodiment refers to a frog-imitating robot or a robot that moves in a frog-like manner. The state data includes: the current position information at the initial moment, the state data of the operation period preceding the initial moment, and the first target position of the current operation period. The current position information includes the position coordinates of the robot's trunk, the orientation of the trunk, and the moving direction and moving speed of the robot; the first target position is generally preset or updated after the previous operation period ends. In particular, the state data of the operation period preceding the initial moment may include the position coordinates and the orientation of the robot's trunk in that period. When the robot moves in water, the environment data are the direction and speed of the water flow; of course, the environment data for other scenes are not limited to this, and may include, for example, the inclination angle of the ground and the friction between the ground and the robot. The initial moment of an operation period refers to the moment at which the frog-imitating robot has completed all actions of one movement period and has not yet started the actions of the next movement period. The duration of the operation period may be preset by a user or a controller.
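Purely as an illustration, the state data and environment data described above could be held in simple containers such as the following sketch; all field names are hypothetical and are not identifiers defined by this disclosure.

```python
from dataclasses import dataclass

@dataclass
class StateData:
    position: tuple          # trunk position coordinates (x, y)
    trunk_direction: float   # trunk orientation, e.g. in radians
    move_direction: float    # current moving direction of the robot
    move_speed: float        # current moving speed
    first_target: tuple      # target position of the current operation period
    previous_state: "StateData | None" = None  # state at the start of the previous period

@dataclass
class EnvironmentData:
    flow_direction: float    # water flow direction
    flow_speed: float        # water flow speed
```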
Step S20, determining target parameters input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model, and outputting operation parameters of the operation period;
The target parameters refer to parameters input into the machine learning model, and the machine learning model can be decision trees, support vector machines, neural networks and other algorithms. The operation parameters are the rotation angle and the rotation speed of each motor on the robot at each time in the operation period.
Step S30, controlling the robot according to the operation parameters.
The operation of the robot's motors is controlled according to the operation parameters so that the robot reaches the first target position. For example, the robot comprises two motors mounted on the hip joints, which control the movement of the left thigh and the right thigh respectively, and one motor on each of the left knee and the right knee, which control the movement of the left calf and the right calf respectively; the rotation angles and speeds of these four motors are controlled according to the operation parameters so as to drive the movement of the robot's trunk, as sketched below. Of course, the greater the number of motors mounted on the trunk, the greater the degrees of freedom of the thighs. The above example is not intended to limit the number of motors provided on the robot.
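A minimal sketch of dispatching such operation parameters (per-motor angle and speed samples over the period) to the four joint motors is given below; the per-motor command(angle, speed) interface and the fixed sampling step are assumptions for illustration.

```python
import time

def run_period(operation_params, motors, dt):
    """operation_params: dict motor_name -> list of (angle, speed) samples for the period.
    motors: dict motor_name -> driver object exposing command(angle, speed)."""
    n_steps = len(next(iter(operation_params.values())))
    for step in range(n_steps):
        for name, trajectory in operation_params.items():
            angle, speed = trajectory[step]
            motors[name].command(angle, speed)   # e.g. left_hip, right_hip, left_knee, right_knee
        time.sleep(dt)                            # wait until the next sample instant
```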
In this embodiment, by determining the target parameters input into the machine learning model and inputting them into the machine learning model, the operation parameters required for the current period can be determined; and since the operation parameters are determined on the basis of the environment data, the robot's operation can be controlled accurately in dynamic waters.
Further, based on the first embodiment, a second embodiment of the control method of the frog-imitating robot according to the present invention is proposed. In this embodiment, referring to fig. 3, the machine learning model comprises a neural network model, and the state data comprises: current position information at the initial moment, historical position information, and historical operation parameters, the historical position information and the historical operation parameters being, respectively, the current position information and the operation parameters of the adjacent operation period preceding the initial moment. Before the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period, the method further comprises the following steps:
step S201, inputting the history position information, the current position information and the environment data into the neural network model, and outputting calibration operation parameters;
The neurons of the neural network are arranged in layers. Each neuron is connected only to neurons of the adjacent layers: it receives the outputs of the previous layer and passes its output to the next layer. The neural network comprises an input layer, hidden layers and an output layer, and the neural network model is obtained by training on a training data set. The training data set includes input features and output results. In this embodiment, the input features are specifically: the initial position, the target position, the water flow direction, the water flow speed, and the direction and speed of the robot; the output results are the angles and speeds set for the robot's motors during the operation period. The input features are not limited to these. Before step S201, the method further comprises: training the neural network model. The step of training the neural network model comprises: controlling the motors of the robot to operate in different modes within one operation period; recording the angles and speeds of the motors during the operation period, together with the initial position, the target position, the water flow direction, the water flow speed, and the direction and speed of the robot's body, as a training data set; and training the neural network on this data set to obtain the neural network model.
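A minimal training sketch along these lines, assuming PyTorch, an input of 6 features and an output holding the angle and speed of each motor at each sampled moment, is shown below; the layer sizes, the number of motors and samples per period, and the optimizer settings are illustrative assumptions, not values given by this disclosure.

```python
import torch
from torch import nn

# Illustrative dimensions (assumptions): 6 input features, and for 4 motors sampled at
# 20 moments per period, an output of 4 * 20 * 2 values (angle and speed per motor per moment).
IN_DIM, OUT_DIM = 6, 4 * 20 * 2

model = nn.Sequential(
    nn.Linear(IN_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, OUT_DIM),
)

def train(model, features, targets, epochs=200, lr=1e-3):
    """features: (N, IN_DIM) tensor of recorded inputs; targets: (N, OUT_DIM) recorded motor commands."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(model(features), targets)
        loss.backward()
        optimizer.step()
    return model
```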
In this embodiment, the historical position information may include, in addition to the specific position, the moving direction and speed of the robot at that position. The position in the historical position information is taken as the initial position, the current position information is taken as the target position, and the moving direction and speed in the historical position information are taken as the body direction and body speed; these are input into the neural network model, and the calibration operation parameters are output.
Step S202, determining a loss value according to the historical operation parameters and the calibration operation parameters;
The historical operation parameters are the parameters that controlled the robot during the period preceding the initial moment; they were intended to move the robot to the historical target position of that period. It is easy to see that, if the neural network model had output fully correct parameters, the position given by the current position information would coincide with the historical target position. When it does not, the parameters output by the neural network model contain errors, that is, the neural network model needs to be adjusted. In this embodiment, the historical operation parameters are not treated as the quantity whose error is to be determined, but as the reference against which the calibration operation parameters are checked: since the historical operation parameters did in fact bring the robot to its current position, they are the correct parameters for moving from the position in the historical position information to the current position. Specifically, the loss value is determined by inputting the historical operation parameters and the calibration operation parameters into a loss function. The loss function here may be a mean square error or a cross-entropy error. In this embodiment, the loss function is:
E = Σ_{i=1}^{n} k_i M_i
where E is the loss value, n is the number of motors, k_i is the error coefficient corresponding to the i-th motor, and M_i is the error value of the i-th motor.
M_i is calculated from the discrepancy between the historical and calibration operation parameter values of the i-th motor accumulated over the operation period, where T is the total duration of the robot's operation period, l_it is the operation parameter value of the i-th motor at moment t in the historical operation parameters, and z_it is the operation parameter value of the i-th motor at moment t in the calibration operation parameters. l_it and z_it are the same kind of parameter; the operation parameter value may, for example, be calculated by multiplying the angle value of the motor at a given moment by its speed value. In this embodiment, the specific data type of the operation parameter value is not limited, and the control parameters of the motors of different robots may differ: for example, when the motor parameters include angle values, l_it and z_it may be the angle parameters of the motors, i.e. l_it is the angle value of the i-th motor at moment t in the historical operation parameters, and z_it is the angle value of the i-th motor at moment t in the calibration operation parameters.
It should be noted that k_i is related to the distance between the motor and the robot's body and is inversely related to that distance, so that by setting a different k_i for each motor, the influence of motors at different positions on the robot's motion can be accounted for. k_i is also inversely related to the change in the direction and speed of the water flow over the total duration of the operation period: the smaller the change, the larger the value of k_i. In this way, the influence of environmental changes on the loss value is reduced, and the accuracy of the loss value is improved.
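The loss described above could be computed as in the following sketch. Because M_i is only characterized in words, its per-moment aggregation (here a sum of absolute differences between l_it and z_it) is an assumption, as is the exact dependence of k_i on the motor-to-body distance and the change in water flow.

```python
import numpy as np

def loss_value(historical, calibration, k):
    """historical, calibration: arrays of shape (n_motors, T) holding the per-moment
    operation parameter values l_it and z_it; k: per-motor error coefficients k_i.
    Assumption: M_i is taken as the sum over t of |l_it - z_it|."""
    M = np.abs(historical - calibration).sum(axis=1)   # per-motor error values M_i
    return float((k * M).sum())                        # E = sum_i k_i * M_i

def error_coefficients(distances, flow_change, scale=1.0):
    """Illustrative k_i: inversely related to the motor-to-body distance and to the
    change in water flow over the period (both assumptions about the exact form)."""
    return scale / (np.asarray(distances) * (1.0 + flow_change))
```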
Step S203, adjusting the neural network model according to the loss value.
Specifically, whether the neural network model needs to be adjusted is determined according to the magnitude of the loss value. When adjustment is needed, a back-propagation algorithm is used to calculate the gradient of the loss function with respect to the model parameters, and the weights and biases of the model are updated by gradient descent, thereby adjusting the neural network model. The gradient descent may be stochastic gradient descent, batch gradient descent, mini-batch gradient descent, or the like.
In this embodiment, the historical position information, the current position information and the environment data are input into the neural network model and the calibration operation parameters are output; the loss value determined using the historical operation parameters indicates whether the output of the neural network model contains errors, and the neural network model is adjusted according to the loss value, thereby improving the accuracy of the output operation parameters and hence the accuracy of the robot's motion.
Further, based on the first or second embodiment, a third embodiment of the control method of the frog-imitating robot according to the present invention is provided. In this embodiment, the machine learning model is an ensemble learning model, the ensemble comprising at least two individual learners and a combination strategy, and the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period comprises:
Inputting the target parameters into at least two individual learners to obtain at least two individual output results;
determining the operation parameters based on the combination strategy and the at least two individual output results.
It should be noted that this embodiment does not conflict with the second embodiment: each individual learner in the ensemble is itself a machine learning model, and different individual learners can be obtained by training the same neural network architecture on different training data sets. Specifically, an individual learner may be the neural network model of the second embodiment, which both receives the target parameters and outputs a corresponding result. When building on the first embodiment, the type of individual learner is not limited; it may be a decision tree, a support vector machine, a neural network, or the like. The combination strategy processes the at least two individual output results to determine accurate operation parameters.
In this embodiment, the operation parameters are determined by the ensemble learning method, combining the results obtained by different individual learners, thereby improving the accuracy of the operation parameters and hence the accuracy of robot control.
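As a sketch of the combination step under a predicted-value weighting strategy, the individual learners can be treated as callables that map the target parameters to an operation-parameter vector; normalizing the weights is an implementation choice, not something prescribed by this disclosure.

```python
import numpy as np

def ensemble_predict(learners, weights, target_params):
    """learners: list of callables returning an operation-parameter vector;
    weights: one weight per learner (normalized here); returns the weighted combination."""
    outputs = np.stack([learner(target_params) for learner in learners])
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w[:, None] * outputs).sum(axis=0)
```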
Further, based on the third embodiment, a fourth embodiment of the control method of the frog-imitating robot according to the present invention is provided. In this embodiment, the state data includes current position information at the initial moment, a first target position, and a second target position of the operation period preceding the operation period corresponding to the initial moment, and before the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period, the method further includes:
when the distance between the current position information and the second target position is greater than a preset distance threshold value, updating the first target position and/or adjusting the combination strategy;
and when the distance between the current position information and the second target position information is smaller than or equal to a preset distance threshold value, determining to execute the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period.
The difference between the first target position and the second target position is that the operation period corresponding to the first target position comes immediately after the operation period corresponding to the second target position. The distance between the current position information and the second target position refers to the straight-line distance. The step of updating the first target position includes: determining an updated first target position according to the current position information, the environment data, and the moving capability of the robot.
In this embodiment, whether to change the first target position is determined according to the distance between the current position information and the second target position, so that a suitable target position is obtained in each operation period of the robot; this prevents a position deviation at the end of the previous period from affecting the operation parameters used to control the robot in the next period, and improves the accuracy of the robot's operation.
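The decision described in this embodiment might be implemented as in the following sketch; update_first_target and adjust_strategy are hypothetical stand-ins for the operations described above.

```python
import math

def check_period_start(current_pos, second_target, threshold,
                       update_first_target, adjust_strategy):
    """If the robot ended the previous period too far from its target, update the next
    target and/or adjust the combination strategy before computing new operation parameters."""
    distance = math.dist(current_pos, second_target)   # straight-line distance
    if distance > threshold:
        update_first_target()
        adjust_strategy()
        return False   # caller may recompute inputs before running the model
    return True        # proceed: input the target parameters into the model
```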
Further, based on the fourth embodiment, a fifth embodiment of the control method of the frog-imitating robot according to the present invention is provided. In this embodiment, the combination strategy is a predicted-value weighting method, and the adjusting of the combination strategy includes:
determining whether the environment data is consistent with the historical environment data of the operation period preceding the operation period corresponding to the initial moment;
Updating corresponding weighting weights of at least two of the individual learners when the environmental data and the historical environmental data are inconsistent;
and updating the corresponding weighting weights of all the individual learners when the environment data is consistent with the historical environment data.
The predicted-value weighting method multiplies the output result of each individual learner by its corresponding weight and then sums the weighted predictions to obtain the final operation parameters. The environment data and the historical environment data are of the same kind; specifically, both consist of the water flow direction and the water flow speed, so it is determined whether the water flow direction in the environment data matches that in the historical environment data and whether the water flow speed in the environment data equals that in the historical environment data. When the weighting weights are adjusted, whether the output of each model is normal is determined through cross-validation or a hold-out method, and the corresponding weight is adjusted accordingly. In other embodiments, the combination strategy is a learning strategy, and adjusting the combination strategy specifically includes: determining an evaluation value of the operation parameters of the period preceding the period corresponding to the initial moment, according to the distance between the current position information and the second target position information; taking the evaluation value together with the individual output result of each individual learner in that preceding period as a new training sample and adding it to the training set of the learning strategy; and retraining the learning strategy on the training set.
In this embodiment, the weighting weights of the individual learners are updated by comparing the environment data with the historical environment data, thereby improving the accuracy of the output operation parameters.
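One possible form of the weight update is sketched below; evaluating each learner on held-out data and re-weighting inversely to its error is an assumption, since the disclosure only states that the weights of at least two learners (or of all learners) are updated after a cross-validation or hold-out check.

```python
import numpy as np

def update_weights(weights, holdout_errors, indices=None):
    """Re-weight learners in inverse proportion to their hold-out error.
    indices: which learners to update (None = all), per the consistency check above."""
    w = np.asarray(weights, dtype=float)
    err = np.asarray(holdout_errors, dtype=float)
    idx = np.arange(len(w)) if indices is None else np.asarray(indices)
    w[idx] = 1.0 / (err[idx] + 1e-8)   # smaller error -> larger weight
    return w / w.sum()
```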
Further, based on any one of the above embodiments, a sixth embodiment of the control method of the frog-imitating robot according to the present invention is provided. In this embodiment, the state data includes the moving speed and moving direction of the robot, the current position information at the initial moment, and the first target position information; the environment data includes the water flow direction and the water flow speed; and the step of determining the target parameters input into the machine learning model according to the state data and the environment data includes:
And determining the target parameters as the moving speed, the moving direction, the current position information, the first target position information, the water flow direction and the water flow speed according to the state data and the environment data.
In this embodiment, the target parameters are determined, based on the state data and the environment data, to correspond to the input parameters required by the machine learning model. Specifically, the state data and the environment data are preprocessed and normalized, for example by expressing the coordinates corresponding to the current position information and the first target position information in the same coordinate system. The target parameters are fed to the input layer of the neural network model; since there are 6 target parameters, the input layer may have 6 neurons. The input layer passes the parameters to the hidden layers, whose types include fully connected layers, convolutional layers, recurrent layers and the like, and finally the hidden layers pass the parameters to the output layer, which outputs the operation parameters.
In this embodiment, the target parameters are determined from the state data and the environment data, which ensures their accuracy and adapts them to the environment, thereby improving the accuracy of the output operation parameters.
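Assembling and normalizing the target parameters into the model's input vector could look like the following sketch, using containers with fields like those sketched earlier; the normalization constants are illustrative assumptions, and each 2-D position contributes two coordinates even though the disclosure counts a position as a single parameter.

```python
import numpy as np

def build_target_parameters(state, env, max_speed=1.0, arena_size=10.0):
    """Return the model input vector: moving speed, moving direction, current position,
    first target position, water flow direction and water flow speed. Positions are
    expressed in the same coordinate frame and scaled; angles are mapped to [-1, 1]."""
    cur = np.asarray(state.position) / arena_size
    tgt = np.asarray(state.first_target) / arena_size
    return np.array([
        state.move_speed / max_speed,
        state.move_direction / np.pi,
        *cur, *tgt,                      # 2 + 2 position coordinates
        env.flow_direction / np.pi,
        env.flow_speed / max_speed,
    ])
```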
Further, based on any one of the above embodiments, a seventh embodiment of the control method of the frog-imitating robot according to the present invention is provided. In this embodiment, before the step of acquiring the state data of the robot and the environment data of the position of the robot when the robot is detected to be at the initial moment of an operation period, the method further includes:
and determining the number of the running periods and the target position of each running period according to the starting position, the ending position and the initial environment data of the robot.
Specifically, the starting position, the end position and the initial environment data determine an operation path for the robot. The operation path is divided into a plurality of sub-paths, each sub-path corresponds to one operation period, and the end point of each sub-path is taken as a target position, namely the first target position in the above embodiments. Further, after the step of controlling the robot according to the operation parameters, it is determined whether the end position has been reached; if not, a new operation period starts and step S10 is executed again.
In this embodiment, the number of operation periods and the target position of each operation period are determined according to the starting position, the end position and the initial environment data of the robot. Since the number of operation periods is inversely related to the length of each operation period, the length of the operation period can be effectively limited.
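A sketch of dividing the planned path into operation periods is given below; interpolating along the straight line from the starting position to the end position is an assumption, since the disclosure only states that the path determined from the start, the end and the initial environment data is divided into sub-paths whose end points serve as the per-period target positions.

```python
import numpy as np

def plan_periods(start, end, n_periods):
    """Split the straight path from start to end into n_periods sub-paths and return the
    target position (sub-path end point) of each operation period."""
    start, end = np.asarray(start, dtype=float), np.asarray(end, dtype=float)
    fractions = np.linspace(1.0 / n_periods, 1.0, n_periods)
    return [tuple(start + f * (end - start)) for f in fractions]
```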
In addition, the embodiment of the invention also provides a control device of the frog-imitating robot, which comprises:
the detection module is used for acquiring state data of the robot and environment data of the position of the robot when the robot is detected to be at the initial moment of the operation period;
the computing module is used for determining target parameters input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period;
and the control module is used for controlling the robot according to the operation parameters.
In addition, an embodiment of the present invention also provides a robot, the robot comprising: a memory, a processor, and a control program of the frog-imitating robot stored in the memory and executable on the processor, wherein the control program of the frog-imitating robot is configured to implement the steps of any of the above embodiments of the control method of the frog-imitating robot.
In addition, an embodiment of the present invention also provides a storage medium on which a control program of the frog-imitating robot is stored; when executed by a processor, the control program implements the steps of any of the above embodiments of the control method of the frog-imitating robot.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware, although in many cases the former is preferred. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, including instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods of the embodiments of the present invention.
The foregoing description covers only preferred embodiments of the present invention and does not limit the scope of the invention; any equivalent structural or process transformation made using the contents of this description and the accompanying drawings, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. The frog-imitating robot control method is characterized by comprising the following steps of:
when the robot is detected to be at the initial moment of an operation period, acquiring state data of the robot and environment data of the position of the robot;
Determining target parameters input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model, and outputting operation parameters of the operation period;
and controlling the robot according to the operation parameters.
2. The method of claim 1, wherein the machine learning model comprises a neural network model, and the state data comprises: current position information at the initial moment, historical position information, and historical operation parameters, the historical position information and the historical operation parameters being, respectively, the current position information and the operation parameters of the adjacent operation period preceding the initial moment, and wherein, before the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period, the method further comprises:
inputting the historical position information, the current position information and the environmental data into the neural network model, and outputting calibration operation parameters;
determining a loss value based on the historical operating parameter and the calibration operating parameter;
And adjusting the neural network model according to the loss value.
3. The frog-imitating robot control method of claim 1, wherein the machine learning model is an ensemble learning model, the ensemble learning comprising: at least two individual learners and a combination strategy, and the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period comprises:
Inputting the target parameters into at least two individual learners to obtain at least two individual output results;
determining the operation parameters based on the combination strategy and the at least two individual output results.
4. The frog-like robot control method according to claim 3, wherein the state data includes current position information at the initial time, first target position information, and second target position information of a period preceding an operation period corresponding to the initial time, and the step of inputting the target parameter into the machine learning model and outputting the operation parameter of the operation period further includes, before the step of outputting the operation parameter of the operation period:
when the distance between the current position information and the second target position information is larger than a preset distance threshold value, updating the first target position and/or adjusting the combination strategy;
and when the distance between the current position information and the second target position information is smaller than or equal to a preset distance threshold value, determining to execute the step of inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period.
5. The method of claim 4, wherein the combination strategy is a predicted-value weighting method, and the adjusting of the combination strategy comprises:
determining whether the environment data is consistent with the historical environment data of the operation period preceding the operation period corresponding to the initial moment;
Updating corresponding weighting weights of at least two of the individual learners when the environmental data and the historical environmental data are inconsistent;
and updating the corresponding weighting weights of all the individual learners when the environment data is consistent with the historical environment data.
6. The frog-like robot control method according to claim 1, wherein the state data includes a moving speed and a moving direction of the robot, current position information at the initial moment, and first target position information, the environment data includes a water flow direction and a water flow speed, and the step of determining target parameters input into a machine learning model based on the state data and the environment data includes:
And determining the target parameters as the moving speed, the moving direction, the current position information, the first target position information, the water flow direction and the water flow speed according to the state data and the environment data.
7. The control method of the frog-imitating robot according to any one of claims 1 to 6, further comprising, before the step of acquiring the state data of the robot and the environment data of the position of the robot when the robot is detected to be at the initial moment of the operation period:
and determining the number of the running periods and the target position of each running period according to the starting position, the ending position and the initial environment data of the robot.
8. A frog-imitating robot control device, characterized in that the frog-imitating robot control device comprises:
the detection module is used for acquiring state data of the robot and environment data of the position of the robot when the robot is detected to be at the initial moment of the operation period;
the computing module is used for determining target parameters input into a machine learning model according to the state data and the environment data, inputting the target parameters into the machine learning model and outputting the operation parameters of the operation period;
and the control module is used for controlling the robot according to the operation parameters.
9. A robot, the robot comprising: a memory, a processor and a control program of a frog-like robot stored on the memory and operable on the processor, the control program of the frog-like robot being configured to implement the steps of the frog-like robot control method as claimed in any one of claims 1 to 7.
10. A storage medium, wherein a control program of the frog-imitating robot is stored on the storage medium, and the control program of the frog-imitating robot, when executed by a processor, realizes the steps of the control method of the frog-imitating robot according to any one of claims 1 to 7.
CN202410163775.4A 2024-02-05 2024-02-05 Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium Pending CN117961899A (en)

Priority Applications (1)

Application Number: CN202410163775.4A; Priority Date: 2024-02-05; Filing Date: 2024-02-05; Title: Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium; Publication: CN117961899A (en)

Applications Claiming Priority (1)

Application Number: CN202410163775.4A; Priority Date: 2024-02-05; Filing Date: 2024-02-05; Title: Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium; Publication: CN117961899A (en)

Publications (1)

Publication Number Publication Date
CN117961899A true CN117961899A (en) 2024-05-03

Family

ID=90852806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410163775.4A Pending CN117961899A (en) 2024-02-05 2024-02-05 Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium

Country Status (1)

Country Link
CN (1) CN117961899A (en)

Similar Documents

Publication Publication Date Title
US10148211B2 (en) Machine learning apparatus and method for learning correction value in motor current control, correction value computation apparatus including machine learning apparatus and motor driving apparatus
JP6506219B2 (en) Machine learning device, motor control device and machine learning method for learning current command of motor
US10692018B2 (en) Machine learning device and machine learning method for learning optimal object grasp route
JP6193961B2 (en) Machine learning device and method for optimizing the smoothness of feed of a machine feed shaft, and motor control device equipped with the machine learning device
JP6063016B1 (en) Machine learning method and machine learning device for learning operation command for electric motor, and machine tool provided with the machine learning device
EP1717649B1 (en) Learning control apparatus, learning control method, and computer program
US20220366245A1 (en) Training action selection neural networks using hindsight modelling
US8725294B2 (en) Controlling the interactive behavior of a robot
US20150306768A1 (en) Simulation device for plural robots
Parhi et al. Navigational control of several mobile robotic agents using Petri-potential-fuzzy hybrid controller
JP2007018490A (en) Behavior controller, behavior control method, and program
US20190317472A1 (en) Controller and control method
Al Dabooni et al. Heuristic dynamic programming for mobile robot path planning based on Dyna approach
US20200133273A1 (en) Artificial neural networks having competitive reward modulated spike time dependent plasticity and methods of training the same
Lin et al. An ensemble method for inverse reinforcement learning
CN111830822A (en) System for configuring interaction with environment
WO2021245286A1 (en) Learning options for action selection with meta-gradients in multi-task reinforcement learning
CN114047745B (en) Robot motion control method, robot, computer device, and storage medium
CN117961899A (en) Frog-like robot control method, frog-like robot control device, frog-like robot, and storage medium
KR102376615B1 (en) Method for controlling mobile robot and apparatus thereof
CN111984000A (en) Method and device for automatically influencing an actuator
US20220113724A1 (en) Information processing device, robot system, and information processing method
US11738454B2 (en) Method and device for operating a robot
Provost et al. Self-organizing distinctive state abstraction using options
Sun et al. Unmanned aerial vehicles control study using deep deterministic policy gradient

Legal Events

Date Code Title Description
PB01 Publication