CN114675641A - Unmanned equipment control method and device and electronic equipment - Google Patents


Info

Publication number
CN114675641A
Authority
CN
China
Prior art keywords
driving strategy; target; obstacle; determining; target obstacle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210202495.0A
Other languages
Chinese (zh)
Inventor
张羽
王弘毅
熊方舟
周奕达
丁曙光
任冬淳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202210202495.0A priority Critical patent/CN114675641A/en
Publication of CN114675641A publication Critical patent/CN114675641A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0223 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving speed control of the vehicle
    • G05D1/0231 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
    • G05D1/0257 Control of position or course in two dimensions specially adapted to land vehicles using a radar

Abstract

The embodiment of the specification first determines an interaction point between a target obstacle and the unmanned device, and predicts, according to the motion state of the target obstacle at the current moment, the driving strategy the target obstacle adopts to pass through the interaction point while taking the motion state of the unmanned device into account. A first control parameter adopted by the target obstacle on its way to the interaction point is then determined according to the predicted driving strategy, a second control parameter of the unmanned device is determined according to the first control parameter, and the unmanned device is controlled based on the second control parameter. In this method, the target obstacle and the unmanned device continuously interact while traveling toward the interaction point and adjust their respective control parameters, so the collision risk between them is reduced and the driving safety of the unmanned device is improved.

Description

Unmanned equipment control method and device and electronic equipment
Technical Field
The present disclosure relates to the field of unmanned driving, and in particular, to a method and an apparatus for controlling an unmanned device, and an electronic device.
Background
In the field of unmanned driving, an unmanned device needs to determine its own motion state according to changes in its surrounding environment, so as to drive safely.
In the prior art, when the unmanned device is in a scene such as a road junction or a lane merge, the future motion states of other nearby vehicles are predicted from their historical motion states, and the motion state of the unmanned device itself is then determined based on those predictions. The motion state may include: acceleration, speed, driving direction, position, and the like.
However, when the unmanned device predicts the future motion states of other vehicles, only their historical motion states are considered; the interaction between the unmanned device and those vehicles is ignored. The predicted motion states are therefore inaccurate, the resulting motion state of the unmanned device is inaccurate, and the driving safety of the unmanned device is reduced.
Disclosure of Invention
The embodiments of the present specification provide a method and an apparatus for controlling an unmanned device, and an electronic device, so as to partially solve the problems in the prior art.
The embodiment of the specification adopts the following technical scheme:
the present specification provides a control method for an unmanned device, including:
determining a target obstacle having an interaction behavior with unmanned equipment, and predicting a position point where the target obstacle interacts with the unmanned equipment to serve as an interaction point;
predicting a driving strategy adopted by the target obstacle through the interaction point under the condition that the motion state of the unmanned equipment at the current moment is considered according to the motion state of the target obstacle at the current moment;
determining, according to the driving strategy, a first control parameter adopted while the target obstacle travels to the interaction point;
and determining, according to the first control parameter, a second control parameter for the unmanned device to pass through the interaction point, and controlling the unmanned device based on the second control parameter.
Optionally, determining a target obstacle having an interactive behavior with the unmanned device specifically includes:
determining a planned path corresponding to unmanned equipment and a predicted path corresponding to each obstacle in a preset range of the unmanned equipment;
judging whether the planned path and a predicted path corresponding to each obstacle have an intersection or not;
and determining a target obstacle having an interactive behavior with the unmanned equipment according to the judgment result.
Optionally, determining, according to the determination result, a target obstacle having an interactive behavior with the unmanned device includes:
if the planned path and a predicted path corresponding to the obstacle have an intersection point, judging whether the obstacle is the obstacle with interactive behavior with the unmanned equipment or not according to the time length when the unmanned equipment passes through the intersection point and the time length when the obstacle passes through the intersection point;
if yes, the obstacle is determined to be a target obstacle with interactive behaviors with the unmanned device.
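The intersection-plus-time-window test described above can be sketched in Python as follows; the function name and the 3-second threshold are illustrative assumptions, since the text only refers to "a preset first time difference threshold":

```python
def is_target_obstacle(paths_intersect, device_pass_time, obstacle_pass_time,
                       first_threshold=3.0):
    """Decide whether an obstacle has interactive behavior with the
    unmanned device.

    An obstacle qualifies when its predicted path crosses the planned
    path AND the two pass-through durations differ by no more than a
    preset time difference threshold, so both vehicles would occupy
    the intersection point in roughly the same time window.
    """
    if not paths_intersect:
        return False
    return abs(device_pass_time - obstacle_pass_time) <= first_threshold
```

For example, an obstacle whose path crosses the planned path and whose pass-through time is within the threshold of the device's is flagged as a target obstacle; one with no path intersection never is.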
Optionally, predicting, according to the motion state of the target obstacle at the current time, a driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device at the current time specifically includes:
inputting the predicted motion state of the target obstacle at the current moment, predicted at the previous moment, and the real motion state of the target obstacle at the current moment into a prediction model; determining, through the prediction model, a reward corresponding to each driving strategy according to the difference between the predicted motion state and the real motion state; and predicting, according to the reward corresponding to each driving strategy, the driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned equipment at the current moment.
Optionally, determining a reward corresponding to each driving strategy according to a difference between the predicted motion state and the actual motion state, specifically including:
for each driving strategy, determining at least one reward index corresponding to the driving strategy according to the difference between the predicted motion state and the actual motion state corresponding to the driving strategy, wherein the at least one reward index comprises: a safety reward index, a speed reward index and a position reward index;
and determining the reward corresponding to the driving strategy according to the at least one reward index.
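The text does not pin down how the three reward indices are folded into a single reward; a minimal sketch assuming a weighted sum, with illustrative weights, is:

```python
def combine_reward(safety_index, speed_index, position_index,
                   weights=(0.5, 0.25, 0.25)):
    """Fold the three reward indices of a driving strategy into a single
    reward. The weighted-sum rule and the weights are assumptions made
    for illustration; the text only says the reward is determined from
    the indices."""
    w_safety, w_speed, w_position = weights
    return (w_safety * safety_index
            + w_speed * speed_index
            + w_position * position_index)
```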
Optionally, predicting, according to the reward corresponding to each driving strategy, a driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device at the current time, specifically includes:
determining conditional probability of executing a real driving strategy by the target obstacle according to the reward corresponding to each driving strategy, wherein the real driving strategy is a driving strategy adopted by the target obstacle to reach the real motion state;
determining the actual probability of executing each driving strategy by the target obstacle at the current moment according to the conditional probability and the predetermined initial probability of executing each driving strategy by the target obstacle;
and predicting, according to the actual probability of the target obstacle executing each driving strategy at the current moment, the driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned equipment at the current moment.
Optionally, the method further comprises:
updating the initial probability of the target obstacle executing each driving strategy according to the actual probability of the target obstacle executing each driving strategy at the current moment, re-determining the updated probability as the initial probability, and predicting the driving strategy subsequently adopted by the target obstacle through the interaction point according to the re-determined initial probability.
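The update loop described above (derive a conditional probability from the per-strategy rewards, combine it with the prior, and reuse the posterior as the next initial probability) can be sketched as a Bayesian online update. The softmax likelihood and the temperature parameter are modelling assumptions; the text only states that the conditional probability is determined from the rewards:

```python
import math

def update_strategy_probs(priors, rewards, temperature=1.0):
    """One step of the online update of driving-strategy probabilities.

    priors  : dict mapping each driving strategy (e.g. 'yield', 'precede')
              to its initial probability
    rewards : dict mapping each strategy to its reward at the current moment
    """
    # conditional probability of the observed (real) behavior under each
    # strategy, modelled here as a softmax over the rewards (assumption)
    exp_r = {s: math.exp(r / temperature) for s, r in rewards.items()}
    z = sum(exp_r.values())
    likelihood = {s: v / z for s, v in exp_r.items()}
    # Bayes rule: posterior proportional to likelihood times prior
    unnorm = {s: likelihood[s] * priors[s] for s in priors}
    total = sum(unnorm.values())
    # the posterior becomes the initial probability at the next moment
    return {s: v / total for s, v in unnorm.items()}
```

Starting from a uniform prior, a higher reward for the yielding strategy shifts probability mass toward it on each step, which matches the intended use of the re-determined initial probability.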
Optionally, determining an initial probability of the target obstacle executing each driving strategy specifically includes:
when the target obstacle is determined for the first time to be an obstacle having an interactive behavior with the unmanned device, determining the initial probability of the target obstacle executing each driving strategy according to the time length for the target obstacle to pass through the interaction point, the time length for the unmanned device to pass through the interaction point, and a preset traffic rule.
Optionally, determining, according to the driving strategy, a first control parameter adopted while the target obstacle travels to the interaction point specifically includes:
If the driving strategy is a yielding driving strategy, determining a traffic difference corresponding to the target obstacle according to the difference between the time length for the target obstacle to pass through the interaction point while executing the driving strategy and the time length for the target obstacle to pass through the interaction point normally;
determining a distance difference corresponding to the target obstacle according to the distance between the target obstacle and the interaction point after the target obstacle has executed the driving strategy for a target duration, wherein the target duration is the time length for the unmanned equipment to pass through the interaction point normally;
and determining, with the goal of minimizing the traffic difference and the distance difference, a first control parameter adopted while the target obstacle travels to the interaction point according to the driving strategy.
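A minimal sketch of the yielding case: search a grid of candidate decelerations (standing in for the first control parameter) and keep the one minimizing the sum of the traffic difference and the distance difference. The uniform-deceleration motion model, the hard constraint that the obstacle must not enter the interaction point before the device clears it, and the equal weighting of the two cost terms are all illustrative assumptions beyond what the text specifies:

```python
def yield_control_parameter(s2, v2, t_normal, t_target, decel_grid=None):
    """Grid-search sketch for the yielding first control parameter.

    s2       : obstacle distance to the interaction point
    v2       : obstacle instantaneous speed
    t_normal : time for the obstacle to pass the point at constant speed
    t_target : target duration (time for the device to pass the point)
    Returns the deceleration with the lowest combined cost, or None.
    """
    if decel_grid is None:
        decel_grid = [i * 0.25 for i in range(13)]  # 0.0 .. 3.0 m/s^2
    best_a, best_cost = None, float("inf")
    for a in decel_grid:
        # time for the obstacle to reach the point under constant decel a
        if a == 0.0:
            t_pass = s2 / v2
        else:
            disc = v2 * v2 - 2.0 * a * s2
            if disc < 0.0:
                continue            # obstacle would stop short of the point
            t_pass = (v2 - disc ** 0.5) / a
        if t_pass < t_target:
            continue                # would enter before the device clears it
        # position after braking for the target duration (clamped at full stop)
        t_run = min(t_target, v2 / a) if a > 0.0 else t_target
        covered = v2 * t_run - 0.5 * a * t_run * t_run
        dist_diff = max(s2 - covered, 0.0)       # remaining gap to the point
        traffic_diff = abs(t_pass - t_normal)    # extra transit time vs normal
        cost = traffic_diff + dist_diff
        if cost < best_cost:
            best_cost, best_a = cost, a
    return best_a
```

With an obstacle 20 m from the point at 10 m/s, a normal pass time of 2 s and a target duration of 3 s, the search settles on the mildest deceleration that still keeps the obstacle clear of the point until the device has passed.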
Optionally, determining, according to the driving strategy, a first control parameter adopted while the target obstacle travels to the interaction point specifically includes:
and if the driving strategy is a preceding driving strategy, determining a preset control parameter as the first control parameter adopted while the target obstacle travels to the interaction point according to the driving strategy.
The present specification provides a control device for an unmanned device, including:
a first determining module, configured to determine a target obstacle having an interactive behavior with the unmanned device, and predict a position point where the target obstacle interacts with the unmanned device as an interaction point;
a prediction module, configured to predict, according to the motion state of the target obstacle at the current moment, a driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device at the current moment;
a second determining module, configured to determine, according to the driving strategy, a first control parameter adopted while the target obstacle travels to the interaction point;
and a control module, configured to determine, according to the first control parameter, a second control parameter for the unmanned device to pass through the interaction point, and control the unmanned device based on the second control parameter.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described control method for an unmanned device.
The electronic device provided by this specification comprises a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the above-described control method for an unmanned device.
The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:
in the embodiment of the specification, an interaction point between a target obstacle and the unmanned device is determined, and the driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device is predicted according to the motion state of the target obstacle at the current moment. A first control parameter adopted by the target obstacle on its way to the interaction point is determined according to the predicted driving strategy, a second control parameter of the unmanned device is determined according to the first control parameter, and the unmanned device is controlled based on the second control parameter. In this method, the target obstacle and the unmanned device continuously interact while traveling toward the interaction point and adjust their respective control parameters, so the collision risk between them can be reduced and the driving safety of the unmanned device is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification and are incorporated in and constitute a part of this specification, illustrate embodiments of the specification and together with the description serve to explain the principles of the specification; they are not intended to limit the specification. In the drawings:
Fig. 1 is a schematic flow chart of a control method for an unmanned device provided in an embodiment of the present specification;
fig. 2 is a schematic view of an interaction scene, taking an intersection as an example, provided in an embodiment of the present specification;
fig. 3 is a schematic structural diagram of a control device for an unmanned device provided in an embodiment of the present specification;
fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present specification.
Detailed Description
In the field of unmanned driving, one core of unmanned technology is to accurately collect road environment information and make reasonable driving decisions accordingly. The behavior of the unmanned vehicle needs to take into account the states of other vehicles (information such as speed, direction, position, and acceleration), so it is necessary to predict the movement paths of other vehicles over a future period of time. When the movement path of another vehicle may conflict with the unmanned vehicle, the unmanned vehicle needs to design its own driving pattern following the traffic regulations and imitating the driving habits of human drivers, improving its stability and flexibility while avoiding the risk of vehicle collision.
During intersection or lane merging, the unmanned vehicle and other vehicles are in a strong interaction scenario, i.e., the behavior and intent of the unmanned vehicle and other vehicles are interacting.
In a strong interaction scene, the prior art predicts the motion path of another vehicle over a future period of time from that vehicle's historical information, and the unmanned vehicle then responds to the predicted motion path, that is, adjusts its own driving behavior.
However, the prior art compresses the short-term historical information of other vehicles into instantaneous information at the current moment, which amounts to making static-scene decisions at a high frequency. In this way, the unmanned vehicle can only respond passively to the behavior of other vehicles and cannot account for possible strategy changes by their drivers, so its driving behavior lacks flexibility and initiative. In addition, the prior art cannot effectively account for the driving styles of other vehicles: mutual-waiting deadlocks easily occur with conservative vehicles, and collision risks easily occur with aggressive vehicles.
The control method of unmanned equipment provided in this specification aims to predict, through a Bayesian online learning model, the driving strategies of other vehicles interacting with the unmanned device; the unmanned device then determines its own driving strategy according to the predicted driving strategies of the other vehicles.
To make the objects, technical solutions and advantages of the present specification clearer and more complete, the technical solutions of the present specification will be described in detail and completely with reference to the specific embodiments of the present specification and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without making any creative effort belong to the protection scope of the present specification.
The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a control method for an unmanned device provided in an embodiment of the present specification, including:
s100: determining a target obstacle having an interaction behavior with the unmanned device, and predicting a position point where the target obstacle interacts with the unmanned device as an interaction point.
In the embodiment of the present specification, the unmanned device may be an unmanned vehicle. The unmanned device may be applied to the logistics distribution field, including both instant delivery such as takeaway distribution and non-instant delivery, and may also be applied to manned services.
In this embodiment, the unmanned device may acquire, through sensors mounted on it, each obstacle within a preset range of the unmanned device and the motion state of each obstacle. The sensors may include: vision sensors, lidar, sensors for positioning, and the like. The motion state of an obstacle may include: speed, driving direction, longitudinal and lateral position, lane position, acceleration, and the like. An obstacle may be a manned vehicle.
After each obstacle within the preset range of the unmanned device is obtained, a predicted path corresponding to each obstacle is predicted according to the historical motion state of that obstacle. A planned path corresponding to the unmanned device is then planned according to the predicted paths of the obstacles. According to the planned path of the unmanned device and the predicted path of each obstacle, the obstacles having interactive behavior with the unmanned device are selected from the obstacles within the preset range as target obstacles, and the position point where a target obstacle interacts with the unmanned device is predicted as the interaction point. A target obstacle having interactive behavior with the unmanned device is an obstacle whose driving conflicts with the unmanned device, and the position point where this conflict occurs, that is, the position point where the target obstacle interacts with the unmanned device, is the interaction point.
When selecting a target obstacle having interactive behavior with the unmanned device from the obstacles within the preset range, the planned path of the unmanned device and the predicted path of each obstacle within the preset range can be obtained, whether the planned path and the predicted path of each obstacle have an intersection point is judged, and the target obstacle having interactive behavior with the unmanned device is determined from the obstacles according to the judgment result.
Specifically, if there is no intersection between the planned path and the predicted path corresponding to the obstacle, it is determined that the obstacle is not an obstacle having an interactive behavior with the unmanned device.
If the planned path and the predicted path corresponding to the obstacle have an intersection point, judging whether the obstacle is a target obstacle with interactive behavior with the unmanned equipment or not according to the time length of the unmanned equipment passing through the intersection point and the time length of the obstacle passing through the intersection point.
When judging, according to the time length for the unmanned device to pass through the intersection point and the time length for the obstacle to pass through the intersection point, whether the obstacle is a target obstacle having interactive behavior with the unmanned device, the judgment may be made according to the difference between the two time lengths. If the difference between the time length for the unmanned device to pass through the intersection point and the time length for the obstacle to pass through the intersection point is not larger than a preset first time difference threshold, the obstacle is determined to be a target obstacle having interactive behavior with the unmanned device. The obstacles having interactive behavior with the unmanned device are then selected as target obstacles from the obstacles according to the judgment result of each obstacle.
Alternatively, when making this judgment, a first difference between the time length for the unmanned device to pass through the intersection point and the time length for the obstacle to reach the intersection point can be determined, and a second difference between the time length for the obstacle to pass through the intersection point and the time length for the unmanned device to reach the intersection point can be determined. The smaller of the first difference and the second difference is then selected as the interaction time difference. Finally, whether the obstacle has interactive behavior with the unmanned device is judged according to the interaction time difference: if the interaction time difference is not larger than a preset second time difference threshold, the obstacle is determined to be a target obstacle having interactive behavior with the unmanned device. The obstacles having interactive behavior with the unmanned device are then selected as target obstacles according to the judgment result of each obstacle.
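Using the four transit durations defined by the formulas that follow in the text (T11 and T12 for the unmanned device, T21 and T22 for the obstacle), the min-difference selection rule above can be sketched as below. The rule is mirrored exactly as stated, and the 3-second threshold stands in for the preset second time difference threshold:

```python
def interaction_time_difference(T11, T12, T21, T22):
    """Interaction time difference: the smaller of the first difference
    (T11 - T22, device fully passes vs obstacle arrives) and the second
    difference (T21 - T12, obstacle fully passes vs device arrives)."""
    return min(T11 - T22, T21 - T12)

def is_interacting(T11, T12, T21, T22, second_threshold=3.0):
    # Mirrors the stated rule: the obstacle is a target obstacle when the
    # interaction time difference does not exceed the preset threshold.
    return interaction_time_difference(T11, T12, T21, T22) <= second_threshold
```

Whether large negative differences (one vehicle clearing the point long before the other arrives) should also count as interaction is left open by the text; the sketch simply applies the threshold as written.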
When the time length that the unmanned equipment passes through the intersection point is determined, the time length that the unmanned equipment passes through the intersection point can be determined according to the length of the vehicle body of the unmanned equipment, the distance between the vehicle head of the unmanned equipment and the intersection point along the planned path, the width of the vehicle body of the obstacle and the instantaneous speed of the unmanned equipment.
The concrete formula is as follows:

T11 = (s1 + l1 + w2) / v1

wherein T11 represents the time length for the unmanned device to pass through the intersection point, s1 represents the distance between the head of the unmanned device and the intersection point along the planned path, l1 represents the body length of the unmanned device, w2 represents the body width of the obstacle, and v1 represents the instantaneous speed of the unmanned device.
When the time length of the unmanned equipment reaching the intersection point is determined, the time length of the unmanned equipment reaching the intersection point can be determined according to the distance between the head of the unmanned equipment and the intersection point along the planned path and the instantaneous speed of the unmanned equipment.
The concrete formula is as follows:

T12 = s1 / v1

wherein T12 represents the time length for the unmanned device to reach the intersection point.
When the time length of the obstacle passing through the intersection point is determined, the time length of the obstacle passing through the intersection point can be determined according to the length of the vehicle body of the obstacle, the distance between the vehicle head of the obstacle and the intersection point along the predicted path, the width of the vehicle body of the unmanned equipment and the instantaneous speed of the obstacle.
The concrete formula is as follows:

T21 = (s2 + l2 + w1) / v2

wherein T21 represents the time length for the obstacle to pass through the intersection point, s2 represents the distance between the head of the obstacle and the intersection point along the predicted path, l2 represents the body length of the obstacle, w1 represents the body width of the unmanned device, and v2 represents the instantaneous speed of the obstacle.
When the time length of the obstacle reaching the intersection point is determined, the time length of the obstacle reaching the intersection point can be determined according to the distance between the head of the obstacle and the intersection point along the predicted path of the obstacle and the instantaneous speed of the obstacle.
The concrete formula is as follows:

T22 = s2 / v2

wherein T22 represents the time length for the obstacle to reach the intersection point.
The formula of the first difference is: T11 - T22. The formula of the second difference is: T21 - T12.
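Taken together, the four transit-duration formulas above can be written as one helper; the expressions are reconstructed from the variable definitions in the text, since the original equation images are unavailable:

```python
def transit_durations(s1, l1, w1, v1, s2, l2, w2, v2):
    """Four transit durations around the intersection point.

    s1, l1, w1, v1 : unmanned device distance-to-point, body length,
                     body width, instantaneous speed
    s2, l2, w2, v2 : obstacle distance-to-point, body length,
                     body width, instantaneous speed
    """
    T11 = (s1 + l1 + w2) / v1   # device fully passes the intersection point
    T12 = s1 / v1               # device reaches the intersection point
    T21 = (s2 + l2 + w1) / v2   # obstacle fully passes the intersection point
    T22 = s2 / v2               # obstacle reaches the intersection point
    return T11, T12, T21, T22
```

Note that fully passing the point requires clearing the own body length plus the other vehicle's body width, which is why l and w from opposite vehicles appear in T11 and T21.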
After the target obstacle having interactive behavior with the unmanned device is determined, the position point where the target obstacle interacts with the unmanned device can be predicted as the interaction point according to the planned path of the unmanned device and the predicted path of the target obstacle; that is, the interaction point is the intersection between the planned path of the unmanned device and the predicted path of the target obstacle.
Based on the above description of determining the target obstacle and the interaction point, an interaction scene diagram, which is provided by the embodiment of the present specification and takes an intersection as an example, is shown in fig. 2.
In fig. 2, the planned path of the unmanned device is L1, the body length of the unmanned device is l1, its body width is w1, and the distance from its head to the interaction point is s1. The predicted path of the target obstacle is L2, the body length of the target obstacle is l2, its body width is w2, and the distance from its head to the interaction point is s2.
In addition, when determining the interaction point between the unmanned device and the target obstacle, the position point where a conflict exists while the unmanned device drives along the planned path and the target obstacle drives along the predicted path can be determined as the interaction point, according to the body length of the unmanned device, the body width of the obstacle and the instantaneous speed of the unmanned device, as well as the body length of the target obstacle, the body width of the unmanned device and the instantaneous speed of the target obstacle.
It should be noted that the control method shown in fig. 1 may be applied to strong-interaction scenes such as intersections or merging lanes, as well as to simple road scenes. The method may be executed by the unmanned device itself, or by a server that controls the unmanned device.
S102: according to the motion state of the target obstacle at the current moment, predict the driving strategy that the target obstacle adopts when passing through the interaction point, taking the motion state of the unmanned device at the current moment into consideration.
In the embodiment of the present specification, after the target obstacle is determined, the driving strategy adopted by the target obstacle when passing through the interaction point, in consideration of the motion state of the unmanned device at the current moment, may be predicted according to the motion state of the target obstacle at the current moment. That is, when the target obstacle detects that the unmanned device conflicts with its own travel route and needs to determine its own driving strategy according to the current motion state of the unmanned device, the unmanned device predicts the driving strategy adopted by the target obstacle at the current moment.
In this way, the gaming behavior between the target obstacle and the unmanned device, that is, the interaction behavior between the target obstacle and the unmanned device, may be determined according to the motion state of the target obstacle at the current time and the motion state of the unmanned device at the current time.
Wherein the motion state may include: speed, acceleration, position, driving direction, and the like. The driving strategy may include: a preceding driving strategy and a yielding driving strategy.
In the case where the target obstacle is not an obstacle determined for the first time to have an interactive behavior with the unmanned device, the driving strategy adopted by the target obstacle when passing through the interaction point, in consideration of the motion state of the unmanned device at the current moment, may be predicted according to the predicted motion state of the target obstacle at the current moment (predicted by the unmanned device at the previous moment) and the real motion state of the target obstacle at the current moment. The predicted motion state may include: predicted speed, predicted acceleration, predicted position, predicted driving direction, and the like; the real motion state may include: real speed, real acceleration, real position, real driving direction, and the like.
Specifically, the predicted motion state of the target obstacle at the current time, which is predicted by the unmanned device at the previous time, and the actual motion state of the target obstacle at the current time may be input to the prediction model, so that the reward corresponding to each driving strategy may be determined according to the difference between the predicted motion state and the actual motion state through the prediction model. The step of determining the reward corresponding to each driving strategy refers to determining a preference coefficient of the target obstacle for each driving strategy.
The unmanned device can predict, according to the reward corresponding to each driving strategy for the target obstacle (that is, according to the preference coefficient of the target obstacle for each driving strategy), the driving strategy adopted by the target obstacle when passing through the interaction point in consideration of the motion state of the unmanned device at the current moment. The prediction model may be a Bayesian online learning model, or a Bayesian online learning model combined with reinforcement learning. The prediction model learns the driving style preferred by the target obstacle from the predicted motion state and the real motion state of the target obstacle that are input in real time; the driving style may be a preference for executing the yielding driving strategy or the preceding driving strategy.
When determining the reward corresponding to each driving strategy according to the difference between the predicted motion state and the real motion state, for each driving strategy, at least one reward index (i.e., preference index) corresponding to that driving strategy can be determined according to the difference between the predicted motion state and the real motion state corresponding to that driving strategy. The at least one reward index includes: a safety reward index (i.e., safety preference index), a speed reward index (i.e., speed preference index), and a position reward index (i.e., position preference index). Then, according to the at least one reward index corresponding to that driving strategy, the reward corresponding to that driving strategy, namely the preference coefficient of the target obstacle for that driving strategy, is determined.
The safety reward index refers to the smaller of the predicted acceleration of the target obstacle in the predicted motion state and the acceleration of the unmanned device in its motion state at the current moment; the speed reward index refers to the difference between the predicted speed and the real speed; and the position reward index refers to the difference between the predicted position and the real position.
The formula for the safety reward index is: Re_s = min(a_p, a_o). Wherein a_p denotes the predicted acceleration of the target obstacle, a_o denotes the acceleration of the unmanned device in its motion state at the current moment, v_p denotes the predicted speed, s_p denotes the predicted position, v_o denotes the speed of the unmanned device in its motion state at the current moment, and s_o denotes the position of the unmanned device in its motion state at the current moment.
The formula for the speed reward index is: Re_v = v_r − v_p, where v_r denotes the real speed.
The formula for the position reward index is: Re_l = s_r − s_p, where s_r denotes the real position.
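Taken together, the three reward (preference) indices above can be sketched as follows; the dictionary keys and function name are hypothetical, and the safety index follows the minimum-acceleration definition given earlier.

```python
def reward_indices(pred, true, ego):
    """Compute the three reward (preference) indices for one driving strategy.

    pred: predicted motion state of the target obstacle
          {'v': speed, 's': position, 'a': acceleration}
    true: real motion state of the target obstacle {'v': ..., 's': ...}
    ego:  current motion state of the unmanned device {'a': acceleration}
    """
    re_s = min(pred["a"], ego["a"])   # safety index: the smaller acceleration
    re_v = true["v"] - pred["v"]      # speed index: Re_v = v_r - v_p
    re_l = true["s"] - pred["s"]      # position index: Re_l = s_r - s_p
    return re_s, re_v, re_l
```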
When the reward corresponding to the driving strategy is determined according to at least one reward index corresponding to the driving strategy, the weight corresponding to each reward index corresponding to the driving strategy can be determined, and then, the weighting summation is carried out on each reward index according to the weight corresponding to each reward index, so that the reward corresponding to the driving strategy is obtained.
Alternatively, when determining the reward corresponding to each driving strategy according to the at least one reward index, multiple sets of reward coefficients can be obtained or sampled, and for each set, the reward corresponding to each driving strategy under that set is determined according to that set of reward coefficients and the at least one reward index corresponding to the driving strategy. Each set of reward coefficients represents, for each driving strategy, the weights that the target obstacle places on the reward indexes (i.e., preference indexes) corresponding to that driving strategy.
Specifically, based on the sub-coefficients in a set of reward coefficients, the at least one reward index corresponding to the driving strategy is weighted and summed to obtain the reward for that driving strategy under that set of reward coefficients. When the sub-coefficients within a set of reward coefficients follow a normal distribution, each set of reward coefficients also follows a normal distribution, and therefore the reward for each driving strategy under each set of reward coefficients follows a normal distribution as well.
A set of reward coefficients includes one sub-coefficient per reward index, and different sets contain different sub-coefficients. The sub-coefficient for a reward index represents the weight of that reward index. The sub-coefficients follow a multivariate normal distribution whose dimensionality is determined by the number of reward indexes. In addition, different driving strategies may correspond to different reward indexes.
For example, the multiple sets of reward coefficients may be θ_1, …, θ_n, where θ_n is the nth set. A set of reward coefficients includes sub-coefficients for the reward indexes. Taking θ_n as an example: when the number of reward indexes is 2, θ_n follows a bivariate normal distribution; when the number of reward indexes is 3, θ_n follows a trivariate normal distribution; and so on. When the number of reward indexes is 3, the sub-coefficients in the nth set of reward coefficients may be written as: θ_n = (θ_n^1, θ_n^2, θ_n^3), where θ_n^1 denotes the first sub-coefficient in the set.
For each driving strategy, assuming the reward indexes corresponding to the driving strategy are the safety reward index Re_s, the speed reward index Re_v, and the position reward index Re_l, the reward for the driving strategy under θ_n is: Re(I_p | θ_n) = θ_n^1·Re_s + θ_n^2·Re_v + θ_n^3·Re_l. Wherein I_p denotes the driving strategy in this example.
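A minimal sketch of sampling the sets of reward coefficients and forming the weighted reward, assuming independent normal sub-coefficients as a stand-in for the multivariate normal described above (all names are hypothetical):

```python
import random

def sample_reward_coefficients(n_sets, n_indices, mean=0.0, std=1.0, seed=None):
    """Sample n_sets groups theta_1..theta_n of sub-coefficients; each group
    has one sub-coefficient per reward index."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, std) for _ in range(n_indices)]
            for _ in range(n_sets)]

def strategy_reward(theta, indices):
    """Re(I | theta_n) = theta^1*Re_s + theta^2*Re_v + theta^3*Re_l."""
    return sum(t * r for t, r in zip(theta, indices))
```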
After determining the reward corresponding to each driving strategy, the driving strategy adopted by the target obstacle through the interaction point under the condition of considering the motion state of the unmanned equipment at the current time can be predicted according to the reward corresponding to each driving strategy. That is, the driving strategy adopted by the target obstacle through the interaction point under the condition of considering the motion state of the unmanned equipment at the current moment is predicted according to the preference coefficient of the target obstacle to each driving strategy.
Specifically, the conditional probability of the target obstacle executing the real driving strategy may be determined according to the reward corresponding to each driving strategy, where the real driving strategy is the driving strategy the target obstacle actually adopted when reaching its real motion state, i.e., the driving strategy adopted by the target obstacle at the previous moment. Then, the posterior probability of the target obstacle executing each driving strategy at the current moment is determined, as the actual probability of the target obstacle executing each driving strategy at the current moment, according to the conditional probability and a predetermined initial probability of the target obstacle executing each driving strategy. Finally, the driving strategy adopted by the target obstacle when passing through the interaction point, in consideration of the motion state of the unmanned device at the current moment, is predicted according to the actual probability of the target obstacle executing each driving strategy at the current moment.
The initial probability of the target obstacle executing each driving strategy may be the probability, determined at the previous moment, of the target obstacle executing each driving strategy at the previous moment, or the probability initially assigned to each driving strategy. The probability determined at the previous moment is, in other words, the probability updated at the previous moment.
The formula for calculating the conditional probability that the target obstacle executes the real driving strategy is: P(I_r | θ_n) = Re(I_r | θ_n) / Σ_I Re(I | θ_n). Wherein Re(I_r | θ_n) represents the reward for the target obstacle executing the real driving strategy, and Σ_I Re(I | θ_n) represents the sum of the rewards for the target obstacle executing all driving strategies.
In order to improve the accuracy of the driving strategy predicted by the target obstacle at the current time, the driving strategy that the target obstacle may possibly adopt at the current time may be predicted by integrating the driving strategies that the target obstacle has historically adopted.
When determining, according to the conditional probability and the predetermined initial probability of the target obstacle executing each driving strategy, the actual probability of the target obstacle executing each driving strategy at the current moment, this actual probability can be determined according to the conditional probabilities corresponding to the real driving strategies executed by the target obstacle at all historical moments (including the previous moment) and the probability, determined at the previous moment, of the target obstacle executing each driving strategy at the previous moment. The "real driving strategy" here refers to the driving strategies actually adopted by the target obstacle at all the historical moments.
The formula for calculating the actual probability that the target obstacle executes each driving strategy at the current moment is: P(θ_n | I_r) ∝ [ ∏_{t=t_0−R+1}^{t_0} P(I_r^t | θ_n) ] · P(θ_n). Wherein P(θ_n | I_r) is the posterior probability that, at the current moment t_0, the target obstacle executes each driving strategy under the reward coefficients θ_n, and the moments from t_0−R+1 to t_0 are all the historical moments. The cumulative product of the conditional probabilities that the target obstacle executed its real driving strategy at all the historical moments approximates the prior probability that, at the current moment, the target obstacle conforms to the real driving strategy under the reward coefficients θ_n. P(θ_n) represents the probability, determined at the previous moment, that the target obstacle executes each driving strategy under θ_n, i.e., the initial probability. In addition, for different θ, P(θ_n | I_r), the cumulative product, and P(θ_n) are also different.
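A minimal sketch of the two probability updates above, assuming positive rewards so that the normalization is well defined (function names are hypothetical):

```python
def conditional_probs(rewards_per_theta, executed_idx):
    """P(I_r | theta_n): the reward of the actually executed strategy,
    normalized by the sum of rewards over all strategies, per coefficient set.
    rewards_per_theta: for each theta set, a list of rewards, one per strategy."""
    return [r[executed_idx] / sum(r) for r in rewards_per_theta]

def posterior_update(prior, likelihood):
    """P(theta_n | I_r) ∝ P(I_r | theta_n) * P(theta_n), renormalized."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]
```

Applying `posterior_update` once per time step accumulates the product of conditional probabilities over the historical moments, matching the recursive form of the posterior.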
When determining, according to the actual probability of the target obstacle executing each driving strategy at the current moment, the driving strategy adopted by the target obstacle when passing through the interaction point in consideration of the motion state of the unmanned device, the driving strategy with the maximum actual probability may be determined as the driving strategy the target obstacle is most likely to adopt.
In addition, when multiple sets of reward coefficients are obtained, after the reward corresponding to each driving strategy is determined for each set, the conditional probability of the target obstacle executing the real driving strategy under each set can be determined according to the rewards under that set. For each set of reward coefficients, the actual probability of the target obstacle executing each driving strategy at the current moment under that set is then determined according to the conditional probability and the initial probability corresponding to that set. Finally, the driving strategy adopted by the target obstacle when passing through the interaction point, in consideration of the motion state of the unmanned device at the current moment, is predicted according to the actual probabilities under all sets of reward coefficients.
Because each set of reward coefficients follows a normal distribution, the actual probability of the target obstacle executing each driving strategy at the current moment, taken over the sets of reward coefficients, also follows a normal distribution for each driving strategy.
Therefore, when predicting the driving strategy that the target obstacle may adopt at the current moment, for each driving strategy, the probability that the reward corresponding to that driving strategy lies at the average reward can be determined, according to the actual probability distribution of the driving strategy and its reward, as the actual probability of the target obstacle adopting that driving strategy at the current moment. Finally, the driving strategy that the target obstacle may adopt at the current moment is predicted according to these actual probabilities; that is, the driving strategy with the maximum actual probability is determined as the driving strategy adopted by the target obstacle when passing through the interaction point, in consideration of the motion state of the unmanned device at the current moment.
In addition, the initial probability of executing each driving strategy on the target obstacle can be updated according to the actual probability of executing each driving strategy on the target obstacle at the current moment, the updated probability is determined as the initial probability again, and the driving strategy adopted by the subsequent target obstacle through the interaction point can be predicted according to the determined initial probability again.
Under the condition of acquiring a plurality of groups of reward coefficients, the initial probability of executing each driving strategy by the target obstacle follows normal distribution, so the initial probability distribution of executing each driving strategy by the target obstacle can be updated according to the actual probability distribution of executing each driving strategy by the target obstacle at the current moment, the updated probability distribution is re-determined as the initial probability distribution, and the driving strategy adopted by the subsequent target obstacle through the interaction point is predicted according to the re-determined initial probability distribution.
S104: determining, according to the driving strategy, a first control parameter adopted by the target obstacle in the process of traveling to the interaction point.
S106: determining, according to the first control parameter, a second control parameter for the unmanned device to pass through the interaction point, and controlling the unmanned device based on the second control parameter.
In the embodiment of the present specification, after the driving strategy that the target obstacle is most likely to adopt at the current moment is predicted, the driving strategy of the unmanned device itself may be determined according to the predicted driving strategy of the target obstacle. Specifically, the first control parameter adopted by the target obstacle in the process of traveling to the interaction point according to the predicted driving strategy may be determined. Then, the unmanned device determines, according to the first control parameter, a second control parameter for itself to pass through the interaction point, and controls itself based on the second control parameter so as to move. The first control parameter and the second control parameter may be acceleration, speed, position, direction, and the like. If the driving strategy of the target obstacle is the yielding driving strategy, the driving strategy of the unmanned device is the preceding driving strategy; conversely, if the driving strategy of the target obstacle is the preceding driving strategy, the driving strategy of the unmanned device is the yielding driving strategy.
In the first case: when the driving strategy of the target obstacle is the preceding driving strategy, the unmanned device predicts that the target obstacle travels at a constant speed or accelerates while executing the preceding driving strategy. In this case, the unmanned device executes the yielding driving strategy, decelerating or stopping while doing so.
In the second case: when the driving strategy of the target obstacle is the yielding driving strategy, the unmanned device predicts that the target obstacle decelerates while executing the yielding driving strategy. In this case, the unmanned device executes the preceding driving strategy, traveling at a constant speed or accelerating while doing so.
In the third case: when the actual probabilities of the target obstacle adopting each driving strategy at the current moment are close to each other, the unmanned device cannot predict which driving strategy the target obstacle will adopt. In this case, the unmanned device predicts that the target obstacle executes a hybrid driving strategy.
For the first case: if the driving strategy of the target obstacle is the preceding driving strategy, a preset control parameter is determined as the first control parameter adopted by the target obstacle in the process of traveling to the interaction point according to the predicted driving strategy.
For the second case:
If the predicted driving strategy of the target obstacle is the yielding driving strategy, then in order for the target obstacle to pass through the interaction point efficiently and safely while executing the yielding driving strategy, the acceleration of the target obstacle during execution of the yielding driving strategy needs to be determined.
Specifically, the traffic difference corresponding to the target obstacle may be determined according to the difference between the duration for the target obstacle to pass through the interaction point while executing the predicted driving strategy and the duration for the target obstacle to pass through the interaction point normally. Passing through normally means passing through the interaction point unaffected by any vehicle, i.e., passing through while executing the preceding driving strategy.
The formula for calculating the traffic difference corresponding to the target obstacle is: EL_Y = T_2^Y(a) − T_2^O. Wherein O denotes the preceding driving strategy, Y denotes the yielding driving strategy, EL_Y represents the traffic difference corresponding to the target obstacle, T_2^Y(a) is the duration for the target obstacle to pass through the interaction point under the yielding strategy, T_2^O is the duration for it to pass through normally, and a is the (unknown) acceleration.
Then, the spacing difference corresponding to the target obstacle is determined according to the distance between the target obstacle and the interaction point after the target obstacle has executed the predicted driving strategy for a target duration. The target duration is the duration for the unmanned device to pass through the interaction point normally.
The formula for calculating the spacing difference corresponding to the target obstacle is: CL_Y = s_2(T_1^O). Wherein CL_Y represents the spacing difference corresponding to the target obstacle, T_1^O represents the duration for the unmanned device to pass through the interaction point when the unmanned device executes the preceding strategy, and s_2(T_1^O) represents the distance between the target obstacle and the interaction point after that duration has elapsed.
Finally, the first control parameter adopted by the target obstacle in the process of traveling to the interaction point according to the predicted driving strategy is determined with the goal of minimizing the traffic difference and the spacing difference.
The formula for the total of the traffic difference and the spacing difference is: TL_Y = EL_Y + CL_Y. It can be seen that TL_Y is a function of the acceleration a. Solving for the acceleration a that minimizes TL_Y yields the first control parameter.
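The minimization over the acceleration a can be sketched with a simple grid search, assuming constant-acceleration kinematics for the durations and distances; these kinematic stand-ins, the candidate range, and all names are illustrative assumptions, not the exact formulas of this specification.

```python
def time_to_pass(s, v, a, t_max=30.0, dt=0.01):
    """Time for a vehicle at distance s from the interaction point, with speed
    v and constant acceleration a, to reach the point (forward integration)."""
    pos, vel, t = 0.0, v, 0.0
    while pos < s and t < t_max:
        vel = max(vel + a * dt, 0.0)   # speed floored at 0 (no reversing)
        pos += vel * dt
        t += dt
    return t

def distance_travelled(v, a, t_end, dt=0.01):
    """Distance covered in t_end seconds at constant acceleration a."""
    pos, vel, t = 0.0, v, 0.0
    while t < t_end:
        vel = max(vel + a * dt, 0.0)
        pos += vel * dt
        t += dt
    return pos

def solve_yield_acceleration(s2, v2, t1_pass):
    """Choose the acceleration a minimizing TL_Y = EL_Y + CL_Y, subject to the
    obstacle not reaching the interaction point before the drone clears it."""
    candidates = [-3.0 + 0.1 * k for k in range(31)]   # -3.0 .. 0.0 m/s^2
    t_normal = time_to_pass(s2, v2, 0.0)
    best_a, best_cost = None, float("inf")
    for a in candidates:
        travelled = distance_travelled(v2, a, t1_pass)
        if travelled >= s2:            # would conflict with the drone: infeasible
            continue
        el = time_to_pass(s2, v2, a) - t_normal   # traffic (delay) difference
        cl = s2 - travelled                        # spacing difference
        cost = el + cl
        if cost < best_cost:
            best_a, best_cost = a, cost
    return best_a
```

Here the feasibility check excludes accelerations under which the target obstacle would reach the interaction point before the unmanned device has cleared it; among the remaining candidates, the mildest deceleration that loses little time while keeping the gap small wins.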
For the third case:
if the unmanned equipment cannot predict which driving strategy the target obstacle adopts at the current time, according to the actual probability of executing each driving strategy by the target obstacle at the current time, the first control parameters determined by executing each driving strategy by the target obstacle at the current time can be subjected to weighted summation to obtain the mixed control parameters of executing the driving strategy by the target obstacle at the current time.
The formula for the hybrid control parameter of the target obstacle executing the driving strategy at the current moment is: a_mix = P_2(O)·a_2^O + P_2(Y)·a_2^Y. Wherein P_2(O) represents the actual probability of the target obstacle executing the preceding driving strategy, P_2(Y) represents the actual probability of the target obstacle executing the yielding driving strategy, a_2^O represents the first control parameter (which may be an acceleration) determined when the target obstacle executes the preceding driving strategy, and a_2^Y represents the first control parameter determined when the target obstacle executes the yielding driving strategy.
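The hybrid control parameter reduces to a one-line probability-weighted blend; a sketch, with hypothetical names:

```python
def hybrid_control_parameter(p_precede, a_precede, a_yield):
    """a_mix = P2(O)*a^O + P2(Y)*a^Y, with P2(Y) = 1 - P2(O)."""
    return p_precede * a_precede + (1.0 - p_precede) * a_yield
```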
As can be seen from the method shown in fig. 1, the present specification first determines the interaction point between the target obstacle and the unmanned device, and predicts, according to the motion state of the target obstacle at the current moment, the driving strategy adopted by the target obstacle when passing through the interaction point in consideration of the motion state of the unmanned device. The first control parameter adopted by the target obstacle in traveling to the interaction point is then determined according to the predicted driving strategy, the second control parameter of the unmanned device is determined according to the first control parameter, and the unmanned device is controlled based on the second control parameter. In this method, the target obstacle and the unmanned device continuously interact while traveling to the interaction point and adjust their respective control parameters, which reduces the collision risk between the unmanned device and the target obstacle and improves the driving safety of the unmanned device.
Further, in steps S102 to S106 shown in fig. 1, when the target obstacle is determined for the first time to be an obstacle that interacts with the unmanned device, there is no predicted motion state of the target obstacle at the current moment from the previous moment. Therefore, the probability of the target obstacle adopting each driving strategy at the current moment may be initialized. Then, according to the initialized probability of each driving strategy, the first control parameter for the target obstacle executing each driving strategy at the current moment is determined, and the unmanned device determines its own second control parameter according to the first control parameter.
Specifically, when the target obstacle is determined for the first time to be an obstacle having an interactive behavior with the unmanned device, the duration for the target obstacle to pass through the interaction point may be determined according to the motion state of the target obstacle at the current moment. Then, the driving strategy adopted by the target obstacle when passing through the interaction point, in consideration of the motion state of the unmanned device at the current moment, is predicted according to the duration for the target obstacle to pass through the interaction point, the duration for the unmanned device to pass through the interaction point, and a preset traffic rule. The preset traffic rule refers to the priorities of the driving directions of the target obstacle and the unmanned device. The priority of driving directions in the traffic rule, from high to low, is: going straight > turning left > turning right, with corresponding priority values 3 > 2 > 1.
That is, the initial probability of the target obstacle executing each driving strategy at the present time is determined according to the time length of the target obstacle passing through the interaction point, the time length of the unmanned device passing through the interaction point, and the priority of the driving direction between the target obstacle and the unmanned device. Then, the travel strategy with the largest initial probability is determined as the travel strategy executed by the target obstacle at the current time.
The initial probability P_2(O) of the target obstacle executing the preceding driving strategy is calculated from a time term weighted by a and a travel-priority term weighted by b. Wherein P_2(O) represents the probability of the target obstacle executing the preceding driving strategy, O represents the preceding driving strategy, a represents the time weight, r_2 represents the priority of the driving direction of the target obstacle, r_1 represents the priority of the driving direction of the unmanned device, r_2 − r_1 represents the travel priority, and b represents the weight of the travel priority. The initial probability of the target obstacle executing the yielding driving strategy is P_2(Y) = 1 − P_2(O), where Y denotes the yielding driving strategy.
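Since the exact functional form of the initial probability is not recoverable from the text, the sketch below assumes a clipped linear combination of the time gap and the driving-direction priority gap; the weights a and b and all names are illustrative assumptions.

```python
def initial_precede_probability(t_device, t_obstacle, r_device, r_obstacle,
                                a=0.1, b=0.15):
    """Initial probability P2(O) that the target obstacle goes first.

    t_device / t_obstacle: durations to pass through the interaction point
    r_device / r_obstacle: direction priorities (straight=3 > left=2 > right=1)
    a: time weight, b: travel-priority weight (illustrative values)
    """
    p = 0.5 + a * (t_device - t_obstacle) + b * (r_obstacle - r_device)
    return min(max(p, 0.0), 1.0)  # clip into [0, 1]; P2(Y) = 1 - P2(O)
```

The sooner the obstacle would reach the interaction point relative to the unmanned device, and the higher its direction priority, the larger its initial probability of going first.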
In the formulas included in this specification, the subscript 1 describes quantities of the unmanned device, the subscript 2 describes quantities of the obstacle or target obstacle, the superscript O describes the preceding driving strategy, and the superscript Y describes the yielding driving strategy.
Based on the same idea, the present specification further provides a corresponding apparatus, a storage medium, and an electronic device.
Fig. 3 is a schematic structural diagram of a control apparatus of an unmanned aerial vehicle according to an embodiment of the present disclosure, where the apparatus includes:
the first determining module 301 is configured to determine a target obstacle having an interaction behavior with an unmanned device, and predict a position point where the target obstacle interacts with the unmanned device, where the position point is used as an interaction point;
the prediction module 302 is configured to predict, according to a motion state of the target obstacle at the current time, a driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device at the current time;
a second determining module 303, configured to determine a first control parameter that is used in a process that the target obstacle travels to the interaction point according to the driving policy;
a control module 304, configured to determine, according to the first control parameter, a second control parameter for the unmanned device to pass through the interaction point, and control the unmanned device based on the second control parameter.
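The chain of modules 301-304 can be illustrated with a stub-level Python sketch. Every name and numeric value below is hypothetical and stands in only for the data flow (strategy prediction, then the obstacle's first control parameter, then the device's second control parameter):

```python
def run_pipeline(obstacle_state, device_state):
    """obstacle_state / device_state: dicts with 't_arrive', the predicted
    arrival time at the interaction point in seconds (hypothetical format)."""
    # module 302: obstacle expected to arrive later -> assume it yields
    strategy = ("yield" if obstacle_state["t_arrive"] > device_state["t_arrive"]
                else "precede")
    # module 303: first control parameter for the obstacle (placeholder values)
    first_param = {"decel": 1.0} if strategy == "yield" else {"decel": 0.0}
    # module 304: the device brakes only when the obstacle keeps its speed
    second_param = {"decel": 0.0 if first_param["decel"] > 0 else 1.5}
    return strategy, second_param
```

The point of the sketch is the dependency order: the device's own control parameter (module 304) is derived from the obstacle's predicted control parameter (module 303), not directly from raw sensor data.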
Optionally, the first determining module 301 is specifically configured to determine a planned path corresponding to the unmanned device and a predicted path corresponding to each obstacle within a preset range of the unmanned device; judge whether an intersection point exists between the planned path and the predicted path corresponding to each obstacle; and determine a target obstacle having an interactive behavior with the unmanned device according to the judgment result.
Optionally, the first determining module 301 is specifically configured to, if an intersection point exists between the planned path and the predicted path corresponding to an obstacle, judge whether the obstacle has an interactive behavior with the unmanned device according to the duration of the unmanned device passing through the intersection point and the duration of the obstacle passing through the intersection point; and if so, determine the obstacle as a target obstacle having an interactive behavior with the unmanned device.
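The two checks above — a geometric intersection between the planned and predicted paths, then a comparison of the two passage durations — can be sketched as follows. The straight-segment path model and the 3-second interaction window are assumptions for illustration only:

```python
def segment_intersection(p1, p2, q1, q2):
    """Return the intersection point of segments p1-p2 and q1-q2, or None."""
    (x1, y1), (x2, y2) = p1, p2
    (x3, y3), (x4, y4) = q1, q2
    d = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if d == 0:                      # parallel or collinear: no single crossing
        return None
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / d
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / d
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None

def is_interacting(t_device, t_obstacle, window=3.0):
    """Treat the obstacle as interacting when both arrival times at the
    intersection point fall within a (hypothetical) time window."""
    return abs(t_device - t_obstacle) <= window
```

Real planned and predicted paths are polylines, so the segment test would be applied pairwise over consecutive waypoints.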
Optionally, the prediction module 302 is specifically configured to input the motion state of the target obstacle at the current time, as predicted at the previous time, together with the actual motion state of the target obstacle at the current time into a prediction model, so as to determine, through the prediction model, a reward corresponding to each driving strategy according to a difference between the predicted motion state and the actual motion state, and predict, according to the reward corresponding to each driving strategy, the driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device at the current time.
Optionally, the prediction module 302 is specifically configured to, for each driving strategy, determine at least one reward indicator corresponding to the driving strategy according to a difference between the predicted motion state and the actual motion state corresponding to the driving strategy, where the at least one reward indicator includes: a safety reward index, a speed reward index and a position reward index; and determining the reward corresponding to the driving strategy according to the at least one reward index.
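A minimal sketch of combining the three named reward indicators. The exponential-of-negative-error form, the state fields, and the equal default weights are assumptions; the text states only that each indicator derives from the gap between predicted and actual motion:

```python
import math

def strategy_reward(predicted, actual, w_safety=1.0, w_speed=1.0, w_pos=1.0):
    """predicted / actual: dicts with 'gap' (clearance to the device, m),
    'v' (speed, m/s) and 'x' (distance travelled, m) -- hypothetical fields.
    Each indicator is 1.0 for a perfect prediction and decays with the error."""
    safety = math.exp(-abs(predicted["gap"] - actual["gap"]))
    speed = math.exp(-abs(predicted["v"] - actual["v"]))
    position = math.exp(-abs(predicted["x"] - actual["x"]))
    return w_safety * safety + w_speed * speed + w_pos * position
```

A driving strategy whose predicted motion state matches reality most closely thus receives the largest reward, which is what the next paragraph's probability update consumes.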
Optionally, the prediction module 302 is specifically configured to determine, according to the reward corresponding to each driving strategy, a conditional probability that the target obstacle executes a real driving strategy, where the real driving strategy is the driving strategy adopted by the target obstacle to reach the real motion state; determine the actual probability of the target obstacle executing each driving strategy at the current time according to the conditional probability and a predetermined initial probability of the target obstacle executing each driving strategy; and predict, according to the actual probability of the target obstacle executing each driving strategy at the current time, the driving strategy adopted by the target obstacle through the interaction point in consideration of the motion state of the unmanned device at the current time.
Optionally, the prediction module 302 is further configured to update the initial probability of the target obstacle executing each driving strategy according to the actual probability of the target obstacle executing each driving strategy at the current time, and re-determine the updated probability as the initial probability, so as to predict, according to the re-determined initial probability, the driving strategy subsequently adopted by the target obstacle through the interaction point.
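The update described in the two paragraphs above behaves like a Bayesian step: the reward-derived conditional probability acts as a likelihood, the initial probability as a prior, and the resulting actual probability becomes the next step's initial probability. A sketch under that reading (the Bayesian interpretation itself is an assumption):

```python
def strategy_posterior(priors, likelihoods):
    """priors: {strategy: initial probability};
    likelihoods: {strategy: conditional probability of the observed behavior}.
    Returns the normalized actual probability of each driving strategy."""
    joint = {s: priors[s] * likelihoods[s] for s in priors}
    z = sum(joint.values())                 # normalizing constant
    return {s: p / z for s, p in joint.items()}
```

Feeding the returned dict back in as the next call's priors mirrors the re-determination of the initial probability described above.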
Optionally, the prediction module 302 is specifically configured to, when the target obstacle is determined for the first time to be an obstacle having an interactive behavior with the unmanned device, determine the initial probability of the target obstacle executing each driving strategy according to the duration of the target obstacle passing through the interaction point, the duration of the unmanned device passing through the interaction point, and a preset traffic rule.
Optionally, the second determining module 303 is specifically configured to, if the driving strategy is a yielding driving strategy, determine a traffic difference corresponding to the target obstacle according to the difference between the duration for the target obstacle to pass through the interaction point while executing the driving strategy and the duration for the target obstacle to pass through the interaction point normally; determine a distance difference corresponding to the target obstacle according to the distance between the target obstacle and the interaction point after the target obstacle has executed the driving strategy for a target duration, where the target duration is the duration for the unmanned device to pass through the interaction point normally; and determine, with the goal of minimizing the traffic difference and the distance difference, a first control parameter adopted in the process that the target obstacle travels to the interaction point according to the driving strategy.
Optionally, the second determining module 303 is specifically configured to, if the driving strategy is a preceding driving strategy, determine a preset control parameter as the first control parameter adopted in the process that the target obstacle moves to the interaction point according to the driving strategy.
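For the yielding case, the minimization over the traffic difference and the distance difference can be sketched as a grid search over candidate decelerations. The uniform-deceleration motion model, the 0-3 m/s² candidate grid, and the equal weighting of the two differences are all assumptions; the patent does not fix a solver:

```python
def first_control_parameter(d_interaction, v0, t_normal, t_target, candidates=None):
    """Pick the obstacle deceleration minimizing
    traffic difference + distance difference (sketch of module 303)."""
    if candidates is None:
        candidates = [i * 0.1 for i in range(31)]   # decelerations 0..3 m/s^2
    best_a, best_cost = None, float("inf")
    for a in candidates:
        # time to cover d_interaction under constant deceleration a
        disc = v0 * v0 - 2.0 * a * d_interaction
        if a > 0 and disc < 0:
            t_pass = float("inf")                   # stops short of the point
        elif a > 0:
            t_pass = (v0 - disc ** 0.5) / a
        else:
            t_pass = d_interaction / v0
        traffic_diff = abs(t_pass - t_normal)
        # distance still separating obstacle and point after t_target seconds
        t_stop = v0 / a if a > 0 else float("inf")
        tt = min(t_target, t_stop)
        travelled = v0 * tt - 0.5 * a * tt * tt
        dist_diff = max(0.0, d_interaction - travelled)
        cost = traffic_diff + dist_diff
        if cost < best_cost:
            best_a, best_cost = a, cost
    return best_a
```

When normal passage already satisfies both terms, the search returns zero deceleration, consistent with minimizing the deviation from normal travel.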
The present specification also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the control method of the unmanned device provided in fig. 1 above.
Based on the control method of the unmanned device shown in fig. 1, an embodiment of the present specification further provides a schematic structural diagram of the electronic device shown in fig. 4. As shown in fig. 4, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may also include hardware required for other services. The processor reads a corresponding computer program from the non-volatile memory into the memory and then runs the computer program to implement the control method of the unmanned device described in fig. 1.
Of course, besides a software implementation, this specification does not exclude other implementations, such as a logic device or a combination of software and hardware; that is, the execution subject of the above processing flows is not limited to logic units, and may also be hardware or a logic device.
In the 1990s, it was easy to tell whether an improvement to a technology was an improvement in hardware (for example, an improvement in a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement in a method flow). As technology has developed, however, many of today's improvements in method flows can be regarded as direct improvements in hardware circuit structures: designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. It therefore cannot be said that an improvement in a method flow cannot be realized with a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, rather than fabricating integrated circuit chips by hand, this programming is now mostly carried out with "logic compiler" software, which is similar to the software compiler used in program development, while the source code to be compiled is written in a particular programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), of which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by performing slight logic programming of the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be implemented entirely by logically programming the method steps, such that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing the functions may be regarded both as software modules for performing the method and as structures within the hardware component.
The systems, apparatuses, modules or units described in the above embodiments may be specifically implemented by a computer chip or an entity, or implemented by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, respectively. Of course, the functionality of the various elements may be implemented in the same one or more pieces of software and/or hardware in the practice of this description.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The description has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
All the embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present disclosure, and is not intended to limit the present disclosure. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (13)

1. A method of controlling unmanned equipment, comprising:
determining a target obstacle having an interactive behavior with the unmanned equipment, and predicting a position point where the target obstacle interacts with the unmanned equipment as an interaction point;
predicting a driving strategy adopted by the target obstacle through the interaction point under the condition that the motion state of the unmanned equipment at the current moment is considered according to the motion state of the target obstacle at the current moment;
determining a first control parameter adopted in the process that the target obstacle travels to the interaction point according to the driving strategy;
and determining a second control parameter aiming at the unmanned equipment to pass through the interaction point according to the first control parameter, and controlling the unmanned equipment based on the second control parameter.
2. The method of claim 1, wherein determining a target obstacle having an interactive behavior with the unmanned equipment specifically comprises:
determining a planned path corresponding to unmanned equipment and a predicted path corresponding to each obstacle in a preset range of the unmanned equipment;
judging whether the planned path and a predicted path corresponding to each obstacle have an intersection or not;
and determining a target obstacle having an interactive behavior with the unmanned equipment according to the judgment result.
3. The method of claim 2, wherein determining a target obstacle with which the unmanned device has interactive behavior according to the determination result specifically comprises:
if the planned path and a predicted path corresponding to the obstacle have an intersection point, judging whether the obstacle is the obstacle with interactive behavior with the unmanned equipment or not according to the time length of the unmanned equipment passing through the intersection point and the time length of the obstacle passing through the intersection point;
If yes, the obstacle is determined to be a target obstacle with interactive behaviors with the unmanned device.
4. The method as claimed in claim 1, wherein predicting the driving strategy adopted by the target obstacle through the interaction point while considering the motion state of the unmanned device at the current time according to the motion state of the target obstacle at the current time specifically comprises:
inputting the predicted motion state of the target obstacle at the current moment predicted at the previous moment and the real motion state of the target obstacle at the current moment into a prediction model, determining the reward corresponding to each driving strategy according to the difference between the predicted motion state and the real motion state through the prediction model, and predicting the driving strategy adopted by the target obstacle through the interaction point under the condition of considering the motion state of the unmanned equipment at the current moment according to the reward corresponding to each driving strategy.
5. The method of claim 4, wherein determining the reward for each driving strategy based on the difference between the predicted motion state and the actual motion state comprises:
For each driving strategy, determining at least one reward index corresponding to the driving strategy according to the difference between the predicted motion state and the actual motion state corresponding to the driving strategy, wherein the at least one reward index comprises: a safety reward index, a speed reward index and a position reward index;
and determining the reward corresponding to the driving strategy according to the at least one reward index.
6. The method as claimed in claim 4, wherein predicting the driving strategy adopted by the target obstacle through the interaction point under the condition of considering the motion state of the unmanned device at the current moment according to the reward corresponding to each driving strategy specifically comprises:
determining, according to the reward corresponding to each driving strategy, a conditional probability that the target obstacle executes a real driving strategy, wherein the real driving strategy is the driving strategy adopted when the target obstacle reaches the real motion state;
determining the actual probability of the target obstacle executing each driving strategy at the current moment according to the conditional probability and a predetermined initial probability of the target obstacle executing each driving strategy;
and predicting, according to the actual probability of the target obstacle executing each driving strategy at the current moment, the driving strategy adopted by the target obstacle through the interaction point under the condition of considering the motion state of the unmanned equipment at the current moment.
7. The method of claim 6, wherein the method further comprises:
updating the initial probability of the target obstacle executing each driving strategy according to the actual probability of the target obstacle executing each driving strategy at the current moment, re-determining the updated probability as the initial probability, and predicting, according to the re-determined initial probability, the driving strategy subsequently adopted by the target obstacle through the interaction point.
8. The method of claim 6, wherein determining an initial probability of the target obstacle implementing each driving strategy comprises:
when the target obstacle is determined to be an obstacle with which the unmanned equipment has interactive behaviors for the first time, determining the initial probability of executing each driving strategy by the target obstacle according to the time length of the target obstacle passing through the interaction point, the time length of the unmanned equipment passing through the interaction point and a preset traffic rule.
9. The method of claim 1, wherein determining a first control parameter used in the process of the target obstacle heading to the interaction point according to the driving strategy comprises:
if the driving strategy is a yielding driving strategy, determining a traffic difference corresponding to the target obstacle according to the difference between the duration for the target obstacle to pass through the interaction point while executing the driving strategy and the duration for the target obstacle to pass through the interaction point normally;
determining a distance difference corresponding to the target obstacle according to the distance between the target obstacle and the interaction point after the target obstacle has executed the driving strategy for a target duration, wherein the target duration is the duration for the unmanned equipment to pass through the interaction point normally;
and determining, with the goal of minimizing the traffic difference and the distance difference, a first control parameter adopted in the process that the target obstacle travels to the interaction point according to the driving strategy.
10. The method according to claim 1, wherein determining a first control parameter used in the process of the target obstacle heading to the interaction point according to the driving strategy specifically comprises:
and if the driving strategy is a preceding driving strategy, determining a preset control parameter as the first control parameter adopted in the process that the target obstacle travels to the interaction point according to the driving strategy.
11. An apparatus for controlling unmanned equipment, comprising:
a first determining module, used for determining a target obstacle having an interactive behavior with the unmanned equipment and predicting a position point where the target obstacle interacts with the unmanned equipment as an interaction point;
the prediction module is used for predicting a driving strategy adopted by the target obstacle through the interaction point under the condition that the motion state of the unmanned equipment at the current moment is considered according to the motion state of the target obstacle at the current moment;
the second determining module is used for determining a first control parameter adopted in the process that the target obstacle travels to the interaction point according to the driving strategy;
and the control module is used for determining a second control parameter aiming at the unmanned equipment to pass through the interaction point according to the first control parameter and controlling the unmanned equipment based on the second control parameter.
12. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when being executed by a processor, carries out the method of any of the preceding claims 1-10.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1-10 when executing the program.
CN202210202495.0A 2022-03-03 2022-03-03 Unmanned equipment control method and device and electronic equipment Pending CN114675641A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210202495.0A CN114675641A (en) 2022-03-03 2022-03-03 Unmanned equipment control method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114675641A true CN114675641A (en) 2022-06-28

Family

ID=82072206

Country Status (1)

Country Link
CN (1) CN114675641A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination