CN113848880A - Agricultural machinery path optimization method based on improved Q-learning - Google Patents
- Publication number
- CN113848880A (application number CN202111006894.1A)
- Authority
- CN
- China
- Prior art keywords
- path
- agricultural machinery
- agricultural
- agricultural machine
- boundary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0223—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving speed control of the vehicle
- G05D1/0221—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
Landscapes
- Engineering & Computer Science (AREA)
- Aviation & Aerospace Engineering (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
Abstract
The invention discloses an agricultural machinery path optimization method based on improved Q-learning, comprising the following steps: S1: determining the initial parameters of path planning; S2: translating the original field boundary inward by a distance L; S3: calculating the minimum span of the agricultural machinery working area; S4: generating the parallel paths of the agricultural machinery working area; S5: calculating the length of the turning paths; S6: optimizing the global path based on the improved Q-learning algorithm. The invention calculates the optimal rotation angle of the original field and then generates parallel paths of the working area that are parallel to the boundary at the optimal rotation angle; because the parallel paths are then aligned with one edge of the field, the calculation is greatly simplified. The global path is then optimized based on the improved Q-learning algorithm to determine the minimum total working length of the agricultural machine. The planned global path is the shortest, which improves working efficiency.
Description
Technical Field
The invention relates to the technical field of agricultural machinery path optimization, in particular to an agricultural machinery path optimization method based on improved Q-learning.
Background
Given how agricultural machines operate, the travel path within the working area is generally made up of straight paths that must cover the whole working area with minimal overlap. The turning path that connects two straight paths is usually determined by the relationship between the spacing of those straight paths and the turning radius of the machine, and different turning paths have different lengths. Because turning reduces working efficiency far more than straight travel does, and because the speed of the machine during a turn can be treated as approximately constant, working efficiency can be improved by reducing the number of turns and shortening the turning paths.
For a simple convex polygonal field, all straight paths can be connected by simple turning paths, and the main factor affecting the total path length is the order in which the straight paths are connected. The global path can therefore be optimized by adjusting that order, which turns agricultural machinery path optimization into a combinatorial optimization problem. Such problems are NP-hard; traditional approaches such as dynamic programming and backtracking rely on trial and error, require a large amount of computation, and do not easily find a globally optimal solution.
For a field with a complex shape or with obstacles inside it, planning all straight paths in the same direction over the whole area tends to increase both repeated and missed areas and reduces working efficiency. In addition, the turning paths may become complex, which makes both path planning and path tracking more difficult.
Disclosure of Invention
The invention provides an agricultural machinery path optimization method based on improved Q-learning, which aims to solve the technical problem that traditional methods such as dynamic programming and backtracking require a large amount of trial-and-error computation and do not easily find a globally optimal solution.
To achieve this purpose, the technical scheme of the invention is as follows:
An agricultural machinery path optimization method based on improved Q-learning comprises the following steps:
S1: determining the initial parameters of path planning, the initial parameters comprising the original field boundary point set P, the width w scanned by each row of the agricultural machine, and the minimum turning radius R of the agricultural machine;
S2: translating the original field boundary point set P toward the interior of the field boundary by a distance L to determine the boundary of the agricultural machinery working area;
S3: establishing an x-y rectangular coordinate system and calculating the minimum span of the agricultural machinery working area to determine the optimal rotation angle of the original field boundary relative to the x axis;
S4: generating parallel paths in the agricultural machinery working area that are parallel to the boundary at the optimal rotation angle, so as to determine the straight paths of the agricultural machine;
S5: determining the type of each turning path and calculating the length of the turning path;
S6: optimizing the path in the agricultural machinery working area based on the improved Q-learning algorithm to determine the optimal total working length of the agricultural machine.
Further, in S3, the minimum span of the agricultural machinery working area is calculated as follows:
when the boundary of the agricultural machinery working area is a convex polygon, the number of turns n of the agricultural machine is given by formula (1),
where D is the distance from one boundary edge of the agricultural machinery working area to the vertex of the working area farthest from that edge, and the scan line is
$y = y_{\min},\ x \in [x_{\min}, x_{\max}]$ (2)
where $y_{\min}$ is the minimum value of the working area boundary on the y axis; $y_{\max}$ is the maximum value of the working area boundary on the y axis; $x_{\min}$ is the minimum value of the working area boundary on the x axis; and $x_{\max}$ is the maximum value of the working area boundary on the x axis;
the rotation angle of the original field at each rotation is given by formula (3),
in which $[x_1, y_1]$ is the start point of the edge of the working area boundary that is to be parallel to the x axis; $[x_2, y_2]$ is the end point of that edge; and $\theta_t$ is the rotation angle. After the field has been rotated for each candidate edge, the rotation angle at which the span D is smallest is the optimal rotation angle $\theta^*$ of the agricultural machinery working area.
Further, in S4, the straight paths of the agricultural machine are determined as follows:
S41: a straight line at the optimal rotation angle $\theta^*$ is used as the scan line and is translated toward the interior of the agricultural machinery working area by w each time; after each translation, the number of intersection points between the scan line and the working area boundary is calculated;
S42: if there are 2 intersection points, the scan line is considered to be still inside the working area boundary and scanning continues; if there is 1 intersection point or none, the scan line is judged to have left the working area boundary, scanning stops, and the generation of the parallel paths is complete.
Further, the turning path in S5 is semicircular, fishtail-shaped, or pi-shaped;
if the spacing between adjacent straight paths equals twice the turning radius, that is, $w = 2R$, the turning path is semicircular and its length is $\pi R$;
if the spacing between adjacent straight paths is less than twice the turning radius, that is, $w < 2R$, the turning path is fishtail-shaped, and the straight-line distance the agricultural machine must travel is:
$l_r = 2R - w$ (4)
where $l_r$ is the straight-line distance travelled during a fishtail turn and $R$ is the turning radius; the length of the turning path is then $(2 + \pi)R - w$;
if the spacing between adjacent straight paths is greater than twice the turning radius, that is, $w > 2R$, the turning path is pi-shaped, and the straight-line distance the agricultural machine must travel is:
$l_f = w - 2R$ (5)
where $l_f$ is the straight-line distance travelled during a pi-shaped turn; the length of the turning path is then $(\pi - 2)R + w$.
Further, the method for optimizing the global path based on the improved Q-learning algorithm in step S6 is as follows:
S61: defining a Q-value table and initializing it so that calculation can start;
S62: defining a state flag table Flag to store, for each straight path, a state quantity indicating whether that path has been connected;
S63: randomly selecting an initial path to determine the straight path on which the agricultural machine starts work;
S64: judging whether the straight path has been connected;
S65: if the straight path has been connected, directly checking whether the Q-value table has converged;
if the straight path has not been connected, selecting the set of next actions based on the current state of the agricultural machine, calculating the reward value of each action in the set, and taking the action with the largest reward value as the next action; updating the Q-value table and the state flag table Flag; and then checking whether the Q-value table has converged;
S66: if the Q-value table has converged, the calculation ends;
if the Q-value table has not converged, judging whether the current Q-value table contains a converged element;
S67: if the current Q-value table contains no converged element, repeating S65 to S66;
if the current Q-value table contains a converged element, locking the corresponding state, updating the Q-value table, and repeating S66.
Further, whether the straight path has been connected is determined in S64 by:
$F(s_n) = \begin{cases} 0, & s_n \text{ has been connected} \\ 1, & \text{otherwise} \end{cases}$ (6)
where $F(s_n)$ is the state quantity indicating whether each path has been connected.
Further, the Q-value function used in step S65 when checking the convergence of the Q-value table is:
$Q(s_c, a'_c) = Q(s_l, a'_l) + \gamma \cdot \max\big(r(s_c, f(s_c, \delta))\big)$ (7)
where $Q(s_c, a'_c)$ is the Q value of the action $a'_c$ that yields the maximum reward in the current state $s_c$ of the agricultural machine; $s_l$ is the state preceding the current state; $a'_l$ is the action that yielded the maximum reward in that preceding state; $\gamma$ is the discount factor; $r(s_c, s_n)$ is the reward function for moving from the current state $s_c$ to the next state $s_n$; and $f(s_c, \delta)$ denotes the set of states reachable from the current state $s_c$ under the set $\delta$ of actions selectable in $s_c$, namely the set given by formula (8),
which is indexed from the first selectable action to the m-th, m being the number of selectable actions.
Further, the reward function used to calculate the reward value in step S65 is given by formula (9),
in which $D(s_c, s_n)$ is the length of the turning path from the current state $s_c$ of the agricultural machine to the next state $s_n$, and $L$ is a weighting coefficient.
Further, before steps S1 to S6, the method further includes:
if the field to be optimized contains an obstacle or is not a convex polygon, first dividing the complex field into several convex polygonal sub-areas, and then performing path planning on each convex polygonal sub-area by the method of steps S1 to S6.
Further, after steps S1 to S6, the method further includes: connecting the planned paths of the convex polygonal sub-areas to obtain the optimal path for the whole field.
Beneficial effects: in the agricultural machinery path optimization method based on improved Q-learning, the optimal rotation angle of the original field is calculated, and parallel paths of the working area that are parallel to the boundary at the optimal rotation angle are then generated; because the parallel paths are then aligned with one edge of the field, the calculation is greatly simplified. The turning paths are then planned, and the global path is optimized based on the improved Q-learning algorithm to determine the minimum total working length of the agricultural machine. The planned global path is the shortest, which improves working efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of an overall agricultural machine path optimization method of the present invention;
FIG. 2 is a schematic diagram of planning parallel paths according to the present invention;
FIG. 3 is a schematic diagram of a complex field being divided into a plurality of convex polygons by a cell-decomposition method;
FIG. 4a is a schematic view of a fishtail turning path and its parameters in accordance with the present invention;
FIG. 4b is a schematic view of a semicircular turn path and its parameters in accordance with the present invention;
FIG. 4c is a schematic diagram of a pi turn path and its parameters according to the present invention;
FIG. 5 is a flow chart of an agricultural machinery path optimization method based on improved Q-learning according to the present invention.
Wherein: 1. a straight path; 2. agricultural machinery work area boundary.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment provides an agricultural machinery path optimization method based on improved Q-learning, which comprises the following steps, as shown in the attached figure 1:
S1: determining the initial parameters of path planning, the initial parameters comprising the original field boundary point set P, the width w scanned by each row of the agricultural machine, and the minimum turning radius R of the agricultural machine;
S2: translating the original field boundary point set P toward the interior of the field boundary by a distance L to determine the boundary of the agricultural machinery working area, within which the straight paths are then planned;
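The inward translation in S2 can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the boundary point set P is a convex polygon listed counter-clockwise and that the offset distance is small enough that no edge disappears; the function name `offset_convex_polygon` is hypothetical.

```python
import math

def offset_convex_polygon(points, dist):
    """Shift every edge of a counter-clockwise convex polygon inward by dist.

    Assumption: dist is small enough that every edge survives the offset.
    """
    n = len(points)
    lines = []  # each offset edge stored as (point_on_line, direction)
    for i in range(n):
        (x1, y1), (x2, y2) = points[i], points[(i + 1) % n]
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        # For counter-clockwise order the inward normal is the edge
        # direction rotated by +90 degrees.
        nx, ny = -dy / length, dx / length
        lines.append(((x1 + nx * dist, y1 + ny * dist), (dx, dy)))
    result = []
    for i in range(n):
        # New vertex i is where the offset of edge i-1 meets the offset of edge i.
        (px, py), (dx1, dy1) = lines[i - 1]
        (qx, qy), (dx2, dy2) = lines[i]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((qx - px) * dy2 - (qy - py) * dx2) / denom
        result.append((px + t * dx1, py + t * dy1))
    return result
```

For a 10 × 10 square and an offset of 1, the sketch returns the corners of the inner 8 × 8 square.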
S3: establishing an x-y rectangular coordinate system and calculating the minimum span of the agricultural machinery working area to determine the optimal rotation angle of the original field boundary relative to the x axis;
In S3, the minimum span of the agricultural machinery working area is calculated as follows. Straight paths are simple to work along and give high coverage, so they are used to cover the main working area. Planning the straight paths focuses on finding a good travel direction so as to reduce the number of turns. For a convex polygonal field without internal obstacles, the straight paths are continuous in every direction, so planning is comparatively simple. Because the working content of the machine is generally fixed and must cover the whole field, adjacent straight paths are equally spaced; minimizing the number of turns is therefore equivalent to finding the minimum span of the convex polygon in the direction perpendicular to the straight paths. As shown in fig. 2, the minimum span of a convex polygon always occurs between a vertex and an edge. The optimal direction is therefore parallel to one edge of the convex polygon. Concretely, for each edge the span between that edge and the vertex farthest from it is calculated, and the direction of the edge with the smallest span is chosen as the optimal straight-path direction for the convex polygonal field.
Specifically, when the boundary of the agricultural machinery working area is a convex polygon, the number of turns n of the agricultural machine is given by formula (1),
where D is the distance from one boundary edge of the agricultural machinery working area to the vertex of the working area farthest from that edge, and the scan line is a straight line parallel to the x axis in the x-y rectangular coordinate system:
$y = y_{\min},\ x \in [x_{\min}, x_{\max}]$ (2)
where $y_{\min}$ is the minimum value of the working area boundary on the y axis; $y_{\max}$ is the maximum value of the working area boundary on the y axis; $x_{\min}$ is the minimum value of the working area boundary on the x axis; and $x_{\max}$ is the maximum value of the working area boundary on the x axis. Since $D = y_{\max} - y_{\min}$, reducing the number of turns amounts to reducing D;
the rotation angle of the original field at each rotation is given by formula (3),
in which $[x_1, y_1]$ is the start point of the edge of the working area boundary that is to be parallel to the x axis; $[x_2, y_2]$ is the end point of that edge; and $\theta_t$ is the rotation angle. After the field has been rotated for each candidate edge, the rotation angle at which the span D is smallest is the optimal rotation angle $\theta^*$ of the agricultural machinery working area;
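The edge-by-edge span search described in S3 can be sketched as follows. This is an illustration under stated assumptions rather than the patented computation: instead of rotating the field, it measures the span for each edge as the largest perpendicular distance from any vertex to the line through that edge, and it reports the direction angle of the edge via `atan2`. The function name `min_span_direction` and the sign convention of the returned angle are assumptions.

```python
import math

def min_span_direction(polygon):
    """Return (span, angle, edge_index) for the edge with the smallest span.

    polygon: list of (x, y) vertices of a convex polygon, in order.
    angle:   direction of the chosen edge relative to the x axis, in radians.
    """
    best = None
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        # Span D for this edge: largest perpendicular distance from any
        # vertex to the line through the edge.
        span = max(abs(dx * (py - y1) - dy * (px - x1)) / length
                   for px, py in polygon)
        angle = math.atan2(dy, dx)
        if best is None or span < best[0]:
            best = (span, angle, i)
    return best
```

For a 10 × 4 rectangle the sketch selects a long edge, so the straight paths run along the 10-unit side and the span to be covered is 4.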
S4: generating parallel paths in the agricultural machinery working area that are parallel to the boundary at the optimal rotation angle, so as to determine the straight paths of the agricultural machine; specifically:
S41: a straight line at the optimal rotation angle $\theta^*$ is used as the scan line and is translated toward the interior of the agricultural machinery working area by w each time; after each translation, the number of intersection points between the scan line and the working area boundary is calculated;
S42: if there are 2 intersection points, the scan line is considered to be still inside the working area boundary and scanning continues; if there is 1 intersection point or none, the scan line is judged to have left the working area boundary, scanning stops, and the generation of the parallel paths is complete.
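Steps S41 and S42 can be sketched as a horizontal sweep. The sketch assumes the working area has already been rotated so that the optimal edge lies along the x axis, that the first scan line sits at a distance w from that edge, and that intersection points are compared after rounding; the function name `parallel_paths` is hypothetical.

```python
def parallel_paths(polygon, w):
    """Sweep a horizontal scan line upward in steps of w through a convex polygon.

    Returns a list of ((x_left, y), (x_right, y)) straight paths.
    """
    if w <= 0:
        raise ValueError("w must be positive")
    y_min = min(y for _, y in polygon)
    n = len(polygon)
    paths = []
    k = 1
    while True:
        y = y_min + k * w
        xs = set()
        for i in range(n):
            (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
            if y1 == y2:
                continue  # edge parallel to the scan line
            if min(y1, y2) <= y <= max(y1, y2):
                t = (y - y1) / (y2 - y1)
                # Round so that a shared vertex counts as one intersection.
                xs.add(round(x1 + t * (x2 - x1), 9))
        if len(xs) < 2:
            break  # one or zero intersections: the scan line has left the area
        paths.append(((min(xs), y), (max(xs), y)))
        k += 1
    return paths
```

Stopping on fewer than two distinct intersections mirrors the rule in S42; a scan line that only touches an apex therefore ends the sweep.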
S5: determining the type of each turning path and calculating the length of the turning path;
specifically, the turning path in S5 is semicircular, fishtail-shaped, or pi-shaped, as shown in figs. 4a to 4c;
specifically, because the agricultural machine travels slowly, its turning path can be regarded as an arc of fixed radius, and the turning radius of the planned path is assumed to be R. In addition, the spacing between adjacent straight paths, that is, the width w scanned by each row of the agricultural machine, is constant, so the turning path depends on the spacing between adjacent straight paths and the turning radius.
If the spacing between adjacent straight paths equals twice the turning radius, that is, $w = 2R$, the turning path is semicircular and its length is $\pi R$;
if the spacing between adjacent straight paths is less than twice the turning radius, that is, $w < 2R$, the turning path is fishtail-shaped. In this case the agricultural machine cannot complete a 180° turn in one movement. It therefore first turns through a quarter circle, then reverses in a straight line for a distance $l_r$, and then turns through another quarter circle to complete the turn, where:
$l_r = 2R - w$ (4)
where $l_r$ is the straight-line distance travelled during a fishtail turn and $R$ is the turning radius; the length of the turning path is then $(2 + \pi)R - w$;
if the spacing between adjacent straight paths is greater than twice the turning radius, that is, $w > 2R$, the turning path is pi-shaped. In this case the agricultural machine first turns through a quarter circle, then travels forward in a straight line for a distance $l_f$, and then turns through another quarter circle to complete the turn, where:
$l_f = w - 2R$ (5)
where $l_f$ is the straight-line distance travelled during a pi-shaped turn; the length of the turning path is then $(\pi - 2)R + w$.
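The three turning cases of S5 can be collected into one function, shown here as a small sketch. The parameter name `spacing` stands for the distance between the two straight paths being connected (w for adjacent paths); the function name `turn_length` and the use of a floating-point tolerance for the equality case are assumptions.

```python
import math

def turn_length(spacing, R):
    """Length of the turn joining two parallel straight paths `spacing` apart."""
    if math.isclose(spacing, 2 * R):
        return math.pi * R                  # semicircular turn
    if spacing < 2 * R:
        # Fishtail turn: two quarter circles plus a reverse of 2R - spacing.
        return (2 + math.pi) * R - spacing
    # Pi-shaped turn: two quarter circles plus a forward run of spacing - 2R.
    return (math.pi - 2) * R + spacing
```

With R = 2, a spacing of 4 gives the semicircle length 2π, while spacings of 2 and 6 both give 2π + 2, which shows that the semicircular turn is the shortest of the three.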
S6: optimizing the path in the agricultural machinery working area based on the improved Q-learning algorithm to determine the optimal total working length of the agricultural machine;
Q-learning is a classic reinforcement learning algorithm that optimizes the decisions of an agent through the results of the agent's interaction with its environment. Specifically, $r(s, a)$ is the immediate reward that the environment (here, the straight path reached when action $a$ is performed) gives the agent (here, the agricultural machine) when the agent performs action $a$ in state $s$ ($s \in S$) at a given time. As the agent performs a series of actions from an initial state to a target state, the environment evaluates each action, and the agent selects the optimal sequence of actions by maximizing the reward.
The method for optimizing the global path based on the improved Q-learning algorithm comprises the following steps:
S61: defining a Q-value table and initializing it so that iterative calculation can start; the initial values of the Q-value table are all 0;
S62: defining a state flag table Flag to store, for each path, a state quantity indicating whether that path has been connected; the initial values of the state flag table Flag are all 1;
S63: randomly selecting an initial straight path to determine the straight path on which the agricultural machine starts work;
S64: judging whether the straight path has been connected;
preferably, whether the straight path has been connected is determined in S64 as follows:
$F(s_n)$ is the state quantity that records whether each straight path has been connected; it is set to 0 if $s_n$ has been connected and to 1 otherwise, namely:
$F(s_n) = \begin{cases} 0, & s_n \text{ has been connected} \\ 1, & \text{otherwise} \end{cases}$ (6)
S65: if the straight path has been connected, directly checking whether the Q-value table has converged;
if the straight path has not been connected, selecting the set of next actions based on the current state of the agricultural machine, calculating the reward value of each action in the set, and taking the action with the largest reward value as the next action; updating the Q-value table and the state flag table Flag; and then checking whether the Q-value table has converged;
S66: if the Q-value table has converged, the calculation ends;
if the Q-value table has not converged, judging whether the current Q-value table contains a converged element;
Specifically, agricultural machinery path optimization aims to reduce the total path length. Provided that no straight path is traversed twice, the global path length depends on the lengths of the turning paths, which in turn depend on the turning path types, so the global path can be optimized by adjusting the turning path types. The type of a turning path is determined by the relationship between the turning radius and the spacing of the two straight paths it connects (one traversed immediately after the other). The current straight path of the agricultural machine (the agent) can therefore be regarded as the current state $s_c$, the selection of the next straight path as an action $a_c$, and the selected next straight path as the state $s_n$ at the next time. The Q-value function to be optimized is thus established as:
$Q(s_c, a'_c) = Q(s_l, a'_l) + \gamma \cdot \max\big(r(s_c, f(s_c, \delta))\big)$ (7)
where $Q(s_c, a'_c)$ is the Q value of the action $a'_c$ that yields the maximum reward in the current state $s_c$ of the agricultural machine; $s_l$ is the state preceding the current state; $a'_l$ is the action that yielded the maximum reward in that preceding state; $\gamma$ is the discount factor ($0 < \gamma < 1$); $r(s_c, s_n)$ is the reward function for moving from the current state $s_c$ to the next state $s_n$; and $f(s_c, \delta)$ denotes the set of possible next states reachable from the current state $s_c$ under the set $\delta$ of actions selectable in $s_c$, namely the set given by formula (8),
which is indexed from the first selectable action to the m-th, m being the number of selectable actions;
for the agricultural machinery path optimization problem, the reward function used to calculate the reward value in step S65 is given by formula (9),
in which $D(s_c, s_n)$ is the length of the turning path from the current state (straight path) $s_c$ of the agricultural machine to the next state $s_n$, and $L$ is a weighting coefficient, which can be chosen as the length of the longest turning path or straight path so that the two terms of the reward function are of the same order of magnitude;
S67: if the current Q-value table contains no converged element, repeating S65 to S66;
if the current Q-value table contains a converged element, locking the corresponding state, updating the Q-value table according to the transfer relationship, and repeating S66.
Preferably, the method for optimizing the global path based on the improved Q-learning algorithm is shown in fig. 5. Suppose a problem has i states and j inputs. For the agricultural machinery path optimization problem, a Q-value table whose elements are all initialized to 0 and a state flag table Flag whose elements are all initialized to 1 are defined first; Flag stores the state quantity indicating whether each path has been connected. In this embodiment, the dimension of the Q-value table is (k × 4), where $k = [D/w]$ is the number of straight paths and $[\cdot]$ denotes rounding; the dimension of the state flag table Flag is (k × 1). Next, the initial state of the calculation (the straight path on which the agricultural machine starts) is determined, and the optimal path is then found by the method for optimizing the global path based on the improved Q-learning algorithm.
In a single iteration, the set of possible next states is determined from the current state, the reward value of each state in the set is calculated, and the state with the largest reward is selected to update the Q-value table and the state flag table Flag. The iteration is repeated until the Q-value table converges, at which point the optimal path has been found. Note that during the iteration, once an element of the Q-value table converges it is locked, that is, the action decision for the state corresponding to that element is no longer optimized. Because each state has only a finite number of actions, locking an element fixes the optimal next state, which can be locked at the same time. This avoids calculating the reward values of all possible next states at every step: once an optimal state is determined, the next state follows directly from the transfer relationship and becomes the current state, which greatly reduces the amount of computation.
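A single pass of the iteration described above can be sketched as follows. This is a simplified illustration, not the full procedure with Q-table convergence checks and locking. Several things are assumptions that do not come from the text: the reward is taken to be L·F(s_n) − D(s_c, s_n), since formula (9) is not reproduced here; L is taken as the longest possible turn; straight paths are indexed 0 to k−1 with uniform spacing w; and the names `turn_length` and `greedy_episode` are hypothetical.

```python
import math

def turn_length(spacing, R):
    """Turn length from S5 (semicircular, fishtail, or pi-shaped)."""
    if math.isclose(spacing, 2 * R):
        return math.pi * R
    if spacing < 2 * R:
        return (2 + math.pi) * R - spacing
    return (math.pi - 2) * R + spacing

def greedy_episode(k, w, R, start=0, gamma=0.9):
    """One pass over k straight paths; returns (visit order, Q values along the pass)."""
    flag = [1] * k                      # Flag table: 1 = not connected, 0 = connected
    # Assumed weighting coefficient: the longest turn that can occur.
    L = max((turn_length(j * w, R) for j in range(1, k)), default=0.0)
    order, q_trace = [start], []
    flag[start] = 0
    current, q_prev = start, 0.0
    while any(flag):
        candidates = [s for s in range(k) if flag[s] == 1]
        # Assumed reward L * F(s_n) - D(s_c, s_n); F(s_n) = 1 for every candidate.
        rewards = {s: L - turn_length(abs(s - current) * w, R) for s in candidates}
        nxt = max(rewards, key=rewards.get)
        q_prev = q_prev + gamma * rewards[nxt]   # accumulation as in equation (7)
        q_trace.append(q_prev)
        flag[nxt] = 0                            # mark the chosen path as connected
        order.append(nxt)
        current = nxt
    return order, q_trace
```

With w = 1 and R = 1 the pass starting at path 0 first jumps to path 2 and then to path 4, because a spacing of 2R permits the semicircular turn, which is the shortest.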
Theoretically, every straight path could become the next state, which greatly increases the iteration complexity of the Q-learning algorithm. Moreover, repeated paths increase the path length, which is contrary to the optimization goal, and treating a path that is too far away as a candidate increases the amount of calculation. In addition, for a pi-shaped turning path, an excessively large $l_f$ is detrimental to minimizing the total path length. The present invention therefore restricts the set of possible next states $f(s_c, \delta)$ to straight paths close to the current one: specifically, the distance between the current state $s_c$ of the agricultural machine (i.e. the straight path currently being connected) and its next state $s_n$ is less than or equal to 4w.
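The 4w restriction can be illustrated with a short Python sketch. It assumes that the straight paths are indexed in order and spaced w apart, so that the distance between paths i and j is |i - j|·w, and that a flag value of 1 means "not yet connected"; the function name `candidate_next_states` is hypothetical.

```python
def candidate_next_states(current, k, flag, max_gap=4):
    """Return unconnected straight paths at most max_gap * w away from `current`."""
    # With paths spaced w apart, "distance <= 4w" becomes "index difference <= 4".
    lo = max(0, current - max_gap)
    hi = min(k - 1, current + max_gap)
    return [s for s in range(lo, hi + 1) if s != current and flag[s] == 1]


# Example: 12 straight paths, paths 5 and 6 already connected.
flag = [1] * 12
flag[5] = 0
flag[6] = 0
neighbours = candidate_next_states(current=5, k=12, flag=flag)
```

Restricting the candidates this way keeps the number of reward evaluations per iteration bounded by a constant, regardless of how many straight paths the field contains.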
Preferably, in this embodiment, before steps S1-S6, the method further includes the following. If the field to be optimized contains an obstacle or is not a convex polygon, both the path planning process and the resulting path become complex, and the repetition rate and the omission rate of the working path of the agricultural machine increase. Therefore, the complex field is first divided into several convex polygonal sub-regions using the well-established cell-decomposition method, as shown in Fig. 3. Specifically, parallel lines are drawn through the vertices on all boundaries of the field (including the outer boundary, obstacles, and so on), dividing the field into several small regions, such as the regions numbered 1-7 in Fig. 3. The regions obtained in this way are all convex polygons. The invention then checks whether the region formed by mutually adjacent sub-regions (those sharing a common boundary) is itself a convex polygon; if so, the merged region is taken as a final sub-region (such as the regions composed of sub-regions 1 and 2, and 5 and 6, in Fig. 3). Path planning is then performed on each convex polygonal sub-region by the method of steps S1-S6.
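The convexity check on a merged region can be sketched with the standard cross-product test. The sketch assumes that the outline of the merged region is already available as an ordered list of vertices of a simple (non-self-intersecting) polygon; the merging step itself is not shown, and the name `is_convex` is hypothetical.

```python
def is_convex(polygon):
    """Return True if the ordered vertices form a convex polygon (collinear points allowed)."""
    n = len(polygon)
    if n < 3:
        return False
    sign = 0
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        x2, y2 = polygon[(i + 2) % n]
        # z-component of the cross product of two consecutive edge vectors.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False  # the turning direction changed: a reflex vertex exists
    return sign != 0  # sign == 0 means all points are collinear


merged_rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
l_shaped_region = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
```

Here `merged_rectangle` would be accepted as a final sub-region, while `l_shaped_region` would be left as separate sub-regions.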
Preferably, after steps S1-S6, the method further comprises: connecting the planned paths of the convex polygonal sub-regions to obtain the optimal path of the whole field. Specifically, for a complex field, once path planning within each convex polygonal sub-region is complete, the paths of the sub-regions are connected to complete path planning for the whole field. To minimize the total path length, the path between sub-regions is a straight line. Automated agricultural machinery generally works in large fields, so the number of convex polygons is much smaller than the number of straight paths within a convex polygon. In this embodiment, the shortest connection path between the convex polygons is therefore selected by enumeration. Specifically, an initial sub-region is selected first, and then any other convex polygon is selected as the next one to connect, until all convex polygons are connected. Each ordering of all the sub-regions is treated as one combination; the path lengths of all combinations are calculated, and the shortest combination is selected as the optimal connection scheme, completing path planning for the whole field.
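The enumeration can be sketched in Python as below. As a simplifying assumption that is not in the text, each sub-region is represented by a single connection point, whereas a full implementation would distinguish the entry and exit points of each sub-region's internal path; the name `best_region_order` is hypothetical.

```python
from itertools import permutations
from math import dist


def best_region_order(points, start=0):
    """Enumerate all visiting orders from `start` and return (order, length) of the shortest."""
    others = [i for i in range(len(points)) if i != start]
    best_order, best_len = None, float("inf")
    for perm in permutations(others):
        order = (start, *perm)
        # Straight-line connections between consecutive sub-regions.
        length = sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))
        if length < best_len:
            best_order, best_len = order, length
    return best_order, best_len


# Four sub-regions, each reduced to one connection point.
points = [(0, 0), (10, 0), (1, 0), (5, 0)]
order, length = best_region_order(points)
```

The number of orderings grows factorially with the number of sub-regions, so this approach relies on the observation above that a field yields only a few convex polygons.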
The invention has the advantages that:
1: The method decomposes a complex field into several simple sub-regions and decomposes the global path planning problem into path planning within sub-regions and path planning between sub-regions, which reduces the complexity of field path planning. In addition, an agricultural machinery path optimization method based on improved Q-learning is introduced for path planning, which improves the working efficiency of the agricultural machine.
2: By introducing the improved Q-learning algorithm, the method addresses the large amount of calculation required by traditional algorithms; it introduces the state flag table in view of the characteristics of agricultural machinery paths and designs a suitable termination condition for path planning.
3: The invention locks states that have already reached the optimum during the iteration and then quickly completes the optimization of adjacent states through the transfer relationship, which greatly reduces the computational load of the iterative process.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. An agricultural machinery path optimization method based on improved Q-learning is characterized by comprising the following steps:
S1: determining initial parameters of path planning, wherein the initial parameters comprise an original field boundary point set P, the scanning width w of each row of the agricultural machine, and the minimum turning radius R of the agricultural machine;
S2: translating the original field boundary point set P toward the interior of the field boundary by a distance L, so as to determine the boundary of the agricultural machinery working area;
S3: establishing an x-y rectangular coordinate system, and calculating the minimum span of the agricultural machinery working area to determine the optimal rotation angle of the original field boundary relative to the x axis;
S4: generating paths parallel to the boundary of the agricultural machinery working area at the optimal rotation angle, so as to determine the straight paths of the agricultural machine;
S5: determining the type of the turning path, and calculating the length of the turning path;
S6: optimizing the path of the agricultural machinery working area based on the improved Q-learning algorithm to determine the optimal total working length of the agricultural machine.
2. The agricultural machinery path optimization method based on the improved Q-learning of claim 1, wherein in the step S3, the method for calculating the minimum span of the agricultural machinery working area is as follows:
when the boundary of the agricultural machinery working area is a convex polygon, the number of turns n of the agricultural machine is as follows:
where D is the distance from one boundary of the agricultural machinery working area to the vertex of the agricultural machinery working area,
$y = y_{min},\ x \in [x_{min}, x_{max}]$ (2)
wherein $y_{min}$ is the minimum value of the boundary of the agricultural machinery working area on the y axis; $y_{max}$ is the maximum value of the boundary of the agricultural machinery working area on the y axis; $x_{min}$ is the minimum value of the boundary of the agricultural machinery working area on the x axis; $x_{max}$ is the maximum value of the boundary of the agricultural machinery working area on the x axis;
the rotation angle for each rotation of the original field is as follows:
wherein $[x_1, y_1]$ is the starting point coordinate of the side parallel to the x axis in the boundary of the agricultural machinery working area; $[x_2, y_2]$ is the end point coordinate of the side parallel to the x axis in the boundary of the agricultural machinery working area; $\theta_t$ is the rotation angle; and, after multiple rotations, the rotation angle at which the span D is minimum is the optimal rotation angle $\theta^*$ of the agricultural machinery working area.
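The search for the minimum span can be sketched in Python as follows. The formula for the rotation angle is not reproduced in this text, so the sketch assumes that each candidate angle is the inclination of one boundary edge, computed with `atan2` from its start and end points, and that the span is the extent of the rotated polygon along the y axis; the name `min_span_rotation` is hypothetical.

```python
import math


def min_span_rotation(polygon):
    """Align each edge with the x-axis in turn; return (theta, span) for the smallest span."""
    best_theta, best_span = None, float("inf")
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        theta = math.atan2(y2 - y1, x2 - x1)  # assumed form of the rotation angle
        # Rotating by -theta makes this edge parallel to the x-axis.
        c, s = math.cos(-theta), math.sin(-theta)
        ys = [x * s + y * c for x, y in polygon]  # y-coordinates after rotation
        span = max(ys) - min(ys)
        if span < best_span:
            best_theta, best_span = theta, span
    return best_theta, best_span


# A 4 x 1 rectangle: the smallest span is 1, reached when a long side lies along the x-axis.
theta, span = min_span_rotation([(0, 0), (4, 0), (4, 1), (0, 1)])
```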
3. The method for optimizing the path of an agricultural machine based on the improved Q-learning of claim 1, wherein in S4, the method for determining the straight path of the agricultural machine is as follows:
S41: a straight line at the optimal rotation angle $\theta^*$ is used as a scanning line and translated toward the interior of the agricultural machinery working area by w each time; after each translation, the number of intersection points between the scanning line and the boundary of the agricultural machinery working area is calculated;
S42: if the number of intersection points is 2, the scanning line is considered to be still within the boundary of the agricultural machinery working area, and scanning continues; if the number of intersection points is 1 or 0, the scanning line is judged to have left the boundary of the agricultural machinery working area, scanning stops, and the generation of the parallel paths is complete.
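The scanning procedure of S41-S42 can be illustrated with the Python sketch below. It assumes that the working area is a convex polygon that has already been rotated so that the scanning lines are horizontal, and that the first scanning line lies one width w inside the lowest boundary; the name `scan_lines` is hypothetical.

```python
def scan_lines(polygon, w):
    """Intersect the lines y = y_min + j*w (j = 1, 2, ...) with a convex polygon.

    Returns a list of ((x_left, y), (x_right, y)) segments. Scanning stops as soon
    as a line no longer meets the boundary at two distinct points.
    """
    y_min = min(y for _, y in polygon)
    n = len(polygon)
    segments = []
    j = 1
    while True:
        y = y_min + j * w
        xs = set()
        for i in range(n):
            (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
            if y1 == y2:
                continue  # edge parallel to the scanning line
            if min(y1, y2) <= y <= max(y1, y2):
                # Linear interpolation along the edge; rounding merges shared vertices.
                xs.add(round(x1 + (y - y1) * (x2 - x1) / (y2 - y1), 9))
        if len(xs) != 2:
            break  # 0 or 1 intersection points: the line has left the working area
        segments.append(((min(xs), y), (max(xs), y)))
        j += 1
    return segments


rectangle_paths = scan_lines([(0, 0), (4, 0), (4, 3), (0, 3)], w=1)
triangle_paths = scan_lines([(0, 0), (4, 0), (2, 2)], w=1)
```

For the triangle, the second scanning line touches only the apex, which gives a single intersection point and ends the scan.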
4. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, wherein the turning path in S5 comprises a semicircle type, a fishtail type and a pi type;
if the distance between adjacent straight paths is equal to twice the turning radius, i.e. $w = 2R$, the turning path is semicircular, and the length of the turning path is $\pi R$;
if the distance between adjacent straight paths is less than twice the turning radius, i.e. $w < 2R$, the turning path is fishtail type; the straight-line distance that the agricultural machine needs to travel in this case is:
$l_r = 2R - w$ (4)
where $l_r$ is the straight-line distance the agricultural machine needs to travel when turning on a fishtail turning path, and R is the turning radius; the length of the turning path is $(2 + \pi) R - w$;
if the distance between adjacent straight paths is greater than twice the turning radius, i.e. $w > 2R$, the turning path is pi-shaped; the straight-line distance that the agricultural machine needs to travel in this case is:
$l_f = w - 2R$ (5)
where $l_f$ is the straight-line distance the agricultural machine needs to travel when turning on a pi-shaped turning path; the length of the turning path in this case is $(\pi - 2) R + w$.
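The three cases above translate directly into a small Python function. The function name `turn_path` and the string labels are hypothetical, and the tolerance-based comparison for the case w = 2R is an implementation choice rather than part of the method.

```python
import math


def turn_path(w, R):
    """Return (turn_type, length) for path spacing w and minimum turning radius R."""
    if math.isclose(w, 2 * R):
        return "semicircle", math.pi * R
    if w < 2 * R:
        # Fishtail turn: two quarter-circle arcs plus a straight segment 2R - w.
        return "fishtail", (2 + math.pi) * R - w
    # Pi-shaped turn: two quarter-circle arcs plus a straight segment w - 2R.
    return "pi", (math.pi - 2) * R + w


kind, length = turn_path(w=3.0, R=2.0)
```

Both the fishtail and the pi-shaped expressions reduce to $\pi R$ when $w = 2R$, so the three formulas agree at the boundary between cases.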
5. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, wherein the method for optimizing the global path based on the Q-learning algorithm in the step S6 is as follows:
S61: defining a Q-value table and initializing it to start the calculation;
S62: defining a state flag table Flag to store the state quantity indicating whether each straight path has been connected;
S63: randomly selecting an initial path of the agricultural machine to determine the straight path on which the agricultural machine starts working;
S64: judging whether the straight path has been connected;
S65: if the straight path has been connected, directly judging the convergence of the Q-value table;
if the straight path has not been connected, selecting the next action set based on the current state of the agricultural machine, and calculating the reward values of the next action set to obtain the next action with the maximum reward value; updating the Q-value table and the state flag table Flag; then judging the convergence of the Q-value table;
S66: if the Q-value table has converged, the calculation ends;
if the Q-value table has not converged, judging whether the current Q-value table contains a converged element;
S67: if the current Q-value table contains no converged element, repeating S65-S66;
if the current Q-value table contains a converged element, locking the corresponding state, updating the Q-value table, and repeating S66.
7. The method as claimed in claim 5, wherein the Q-value function established for determining the convergence status of the Q-value table in step S65 is:
$Q(s_c, a'_c) = Q(s_l, a'_l) + \gamma \cdot \max(r(s_c, f(s_c, \delta)))$ (7)
where $Q(s_c, a'_c)$ is the Q value of the action $a'_c$ corresponding to the reward of the current state $s_c$ of the agricultural machine; $s_l$ is the state preceding the current state of the agricultural machine; $a'_l$ is the action corresponding to the maximum reward of that preceding state; $\gamma$ is the discount factor; $r(s_c, s_n)$ is the reward function from the current state $s_c$ of the agricultural machine to its next state $s_n$; and $f(s_c, \delta)$ denotes the set of states reachable from the current state $s_c$ under the optional action set $\delta$ of that state, namely:
8. The agricultural machinery path optimization method based on improved Q-learning according to claim 5, wherein the reward function established for calculating the reward value in step S65 is:
where $D(s_c, s_n)$ is the length of the turning path from the current state $s_c$ of the agricultural machine to its next state $s_n$; L is a weighting coefficient.
9. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, further comprising, before the steps S1-S6:
if there is an obstacle in the field to be optimized or the field to be optimized is not a convex polygon field, the complex field is firstly divided into a plurality of convex polygon sub-regions, and path planning is performed on the convex polygon sub-regions by the method of the steps S1-S6.
10. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, further comprising after the steps S1-S6: and planning paths of the sub-areas of the convex polygons to obtain the optimal path of the whole field.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111006894.1A CN113848880B (en) | 2021-08-30 | 2021-08-30 | Agricultural machinery path optimization method based on improved Q-learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113848880A (en) | 2021-12-28 |
CN113848880B (en) | 2023-12-22 |
Family
ID=78976547
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111006894.1A, Active, CN113848880B (en) | 2021-08-30 | 2021-08-30 | Agricultural machinery path optimization method based on improved Q-learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113848880B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023207022A1 (en) * | 2022-04-24 | 2023-11-02 | 丰疆智能软件科技(南京)有限公司 | Path planning method and system for automatic operation of agricultural machinery, and device and storage medium |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102167038A (en) * | 2010-12-03 | 2011-08-31 | 北京农业信息技术研究中心 | Method and device for generating all-region-covering optimal working path for farmland plot |
CN107807644A (en) * | 2017-10-30 | 2018-03-16 | 洛阳中科龙网创新科技有限公司 | A kind of farm machinery consumption minimization trajectory path planning method |
CN108089185A (en) * | 2017-03-10 | 2018-05-29 | 南京沃杨机械科技有限公司 | The unmanned air navigation aid of agricultural machinery perceived based on farm environment |
EP3363273A1 (en) * | 2017-02-16 | 2018-08-22 | Amazonen-Werke H. Dreyer GmbH & Co. KG | Agricultural machine system and method for planning lanes for processing an agricultural field |
CN109828575A (en) * | 2019-02-22 | 2019-05-31 | 山东省计算中心(国家超级计算济南中心) | A kind of paths planning method effectively improving agricultural machinery working efficiency |
US20190208695A1 (en) * | 2015-12-03 | 2019-07-11 | Mogens Max Sophus Edzard Graf Plessen | Path Planning for Area Coverage |
CN110597288A (en) * | 2019-09-29 | 2019-12-20 | 陈�峰 | Algorithm based on agricultural machinery field unmanned operation path planning |
CN111189444A (en) * | 2020-03-26 | 2020-05-22 | 洛阳智能农业装备研究院有限公司 | Automatic driving agricultural machinery field operation path planning system and planning method |
CN111580514A (en) * | 2020-05-07 | 2020-08-25 | 中国船舶重工集团公司第七一六研究所 | Mobile robot optimal path covering method based on combined formation |
CN111639811A (en) * | 2020-06-01 | 2020-09-08 | 中国农业大学 | Multi-agricultural-machine cooperative work remote management scheduling method based on improved ant colony algorithm |
CN111721296A (en) * | 2020-06-04 | 2020-09-29 | 中国海洋大学 | Data driving path planning method for underwater unmanned vehicle |
CN112015176A (en) * | 2020-08-14 | 2020-12-01 | 合肥工业大学 | Unmanned tractor field operation path planning method and device |
CN112197775A (en) * | 2020-11-12 | 2021-01-08 | 扬州大学 | Agricultural machinery multi-machine cooperative operation path planning method |
CN113190017A (en) * | 2021-05-24 | 2021-07-30 | 东南大学 | Harvesting robot operation path planning method based on improved ant colony algorithm |
CN113313784A (en) * | 2021-04-29 | 2021-08-27 | 北京农业智能装备技术研究中心 | Method and device for making farmland picture based on unmanned agricultural machine |
Non-Patent Citations (1)
Title |
---|
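孟志军; 刘卉; 王华; 付卫强: "Path optimization method for farmland operation machinery" (农田作业机械路径优化方法), 农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery), no. 06, pages 147-152 * |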
Also Published As
Publication number | Publication date |
---|---|
CN113848880B (en) | 2023-12-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110962130B (en) | Heuristic RRT mechanical arm motion planning method based on target deviation optimization | |
CN110347151B (en) | Robot path planning method fused with Bezier optimization genetic algorithm | |
CN109542106A (en) | A kind of paths planning method under mobile robot multi-constraint condition | |
CN108413963B (en) | Self-learning ant colony algorithm-based strip robot path planning method | |
CN114035572B (en) | Obstacle avoidance tour method and system for mowing robot | |
CN113110520B (en) | Robot path planning method based on multiple intelligent optimization parallel algorithms | |
CN113064426A (en) | Intelligent vehicle path planning method for improving bidirectional fast search random tree algorithm | |
CN116242383B (en) | Unmanned vehicle path planning method based on reinforced Harris eagle algorithm | |
CN115014362B (en) | Cattle-ploughing type full-coverage path planning method and device based on synthesis unit | |
CN113848880A (en) | Agricultural machinery path optimization method based on improved Q-learning | |
CN109931943B (en) | Unmanned ship global path planning method and electronic equipment | |
CN113296520A (en) | Routing planning method for inspection robot by fusing A and improved Hui wolf algorithm | |
CN114545921B (en) | Unmanned vehicle path planning algorithm based on improved RRT algorithm | |
CN113686344A (en) | Agricultural machinery coverage path planning method | |
CN115454062A (en) | Robot dynamic path planning method and system based on Betz curve | |
CN116880497A (en) | Full-coverage path planning method, device and equipment for automatic agricultural machine | |
CN114815845A (en) | Automatic driving agricultural machinery smooth path planning method based on hybrid A-x algorithm | |
CN115167398A (en) | Unmanned ship path planning method based on improved A star algorithm | |
CN110749332B (en) | Curvature optimization method and device of RS curve, computer equipment and storage medium | |
CN113074738A (en) | Hybrid intelligent path planning method and device based on Dyna framework | |
CN115056222A (en) | Mechanical arm path planning method based on improved RRT algorithm | |
Backman et al. | Path generation method with steering rate constraint | |
CN113733095A (en) | Three-dimensional motion gait generation method for wheel-free snake-shaped robot | |
CN112215440A (en) | Method, device and equipment for realizing operation control of agricultural vehicle | |
Wang et al. | A dual-robot cooperative welding path planning algorithm based on improved ant colony optimization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||