CN113848880A - Agricultural machinery path optimization method based on improved Q-learning - Google Patents

Publication number
CN113848880A
Authority
CN
China
Prior art keywords
path
agricultural machinery
agricultural
agricultural machine
boundary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111006894.1A
Other languages
Chinese (zh)
Other versions
CN113848880B (en)
Inventor
董笑辰
陶斯友
纪铁生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CRRC Dalian R&D Co Ltd
Original Assignee
CRRC Dalian R&D Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CRRC Dalian R&D Co Ltd filed Critical CRRC Dalian R&D Co Ltd
Priority to CN202111006894.1A priority Critical patent/CN113848880B/en
Publication of CN113848880A publication Critical patent/CN113848880A/en
Application granted granted Critical
Publication of CN113848880B publication Critical patent/CN113848880B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0223Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving speed control of the vehicle
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0221Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention discloses an agricultural machinery path optimization method based on improved Q-learning, comprising the following steps: S1: determining the initial parameters of path planning; S2: translating the original field boundary inward by a distance L; S3: calculating the minimum span of the agricultural machinery working area; S4: generating parallel paths in the working area; S5: calculating the lengths of the turning paths; S6: optimizing the global path based on the improved Q-learning algorithm. The optimal rotation angle of the original field is calculated first, and the parallel paths of the working area are then generated parallel to the boundary at that angle. Because the parallel paths are then aligned with one edge of the field, the calculation is greatly simplified. The global path is optimized with the improved Q-learning algorithm to determine the minimum total working length of the agricultural machine, so that the planned global path is the shortest and working efficiency is improved.

Description

Agricultural machinery path optimization method based on improved Q-learning
Technical Field
The invention relates to the technical field of agricultural machinery path optimization, in particular to an agricultural machinery path optimization method based on improved Q-learning.
Background
Because of the way agricultural machines work, the travel path within the working area generally consists of straight paths and must cover the whole working area with minimal repetition. The turning path that connects two straight paths is usually determined by the relationship between the spacing of those paths and the turning radius of the machine, and different turning paths have different lengths. Turning reduces working efficiency considerably compared with driving straight, and the turning process can be treated approximately as constant-speed motion, so working efficiency can be improved by reducing the number of turns and optimizing the turning paths.
For a simple convex polygonal field, all straight paths can be connected by simple turning paths, and the main factor affecting the path length is the order in which the straight paths are connected. The global path can therefore be optimized by adjusting the order of the straight paths, which converts the agricultural machinery path optimization problem into a combinatorial optimization problem. This problem is NP-hard; traditional methods such as dynamic programming and backtracking rely on trial and error, require a large amount of computation, and do not easily find the globally optimal solution.
For a field with a complex shape or with obstacles inside, planning all straight paths in the same direction tends to increase the repeated and missed areas and reduces working efficiency. The turning paths may also become complex, which makes both path planning and path tracking difficult.
Disclosure of Invention
The invention provides an agricultural machinery path optimization method based on improved Q-learning. It aims to solve the technical problem that traditional methods such as dynamic programming and backtracking rely on trial and error, require a large amount of computation, and do not easily find the globally optimal solution.
In order to achieve the purpose, the technical scheme of the invention is as follows:
An agricultural machinery path optimization method based on improved Q-learning comprises the following steps:
S1: determining the initial parameters of path planning, which comprise the original field boundary point set P, the width w scanned by each row of the agricultural machine, and the minimum turning radius R of the agricultural machine;
S2: translating the original field boundary point set P toward the interior of the field boundary by a distance L to determine the boundary of the agricultural machinery working area;
S3: establishing an x-y rectangular coordinate system and calculating the minimum span of the working area to determine the optimal rotation angle of the original field boundary relative to the x axis;
S4: generating parallel paths in the working area, parallel to the boundary at the optimal rotation angle, to determine the straight paths of the agricultural machine;
S5: determining the type of each turning path and calculating its length;
S6: optimizing the path of the working area based on the improved Q-learning algorithm to determine the optimal total working length of the agricultural machine.
Further, in S3, the minimum span of the agricultural machinery working area is calculated as follows:
when the boundary of the working area is a convex polygon, the number of turns n of the agricultural machine is:
[Formula (1): equation image BDA0003237534050000021 in the original publication]
where D is the distance from one boundary edge of the working area to a vertex of the working area, and the scanning line is a straight line parallel to the x axis:
$$y = y_{\min}, \quad x \in [x_{\min}, x_{\max}] \tag{2}$$
where $y_{\min}$ and $y_{\max}$ are the minimum and maximum values of the working-area boundary on the y axis, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the working-area boundary on the x axis;
the angle of each rotation of the original field is:
[Formula (3): equation image BDA0003237534050000022 in the original publication]
where $[x_1, y_1]$ and $[x_2, y_2]$ are the start and end coordinates of the edge of the working-area boundary that is parallel to the x axis, and $\theta_t$ is the rotation angle. After multiple rotations, the rotation angle at which the span D is smallest is the optimal rotation angle $\theta^*$ of the working area.
Further, in S4, the straight paths of the agricultural machine are determined as follows:
S41: the straight line at the optimal rotation angle $\theta^*$ is used as a scanning line and is translated toward the interior of the working area by w each time; after each translation, the number of intersection points between the scanning line and the boundary of the working area is calculated;
S42: if there are 2 intersection points, the scanning line is considered to be still within the boundary of the working area, and scanning continues; if there is 1 intersection point or none, the scanning line is judged to have left the boundary of the working area, scanning stops, and the generation of the parallel paths is complete.
Further, the turning paths in S5 include semicircular, fishtail, and Π-shaped paths:
if the spacing between adjacent straight paths equals twice the turning radius, i.e. $w = 2R$, the turning path is semicircular and its length is $\pi R$;
if the spacing between adjacent straight paths is less than twice the turning radius, i.e. $w < 2R$, the turning path is fishtail-shaped, and the straight-line distance the agricultural machine must travel is:
$$l_r = 2R - w \tag{4}$$
where $l_r$ is the straight-line distance travelled during a fishtail turn and R is the turning radius; the length of the turning path is then $(2 + \pi)R - w$;
if the spacing between adjacent straight paths is greater than twice the turning radius, i.e. $w > 2R$, the turning path is Π-shaped, and the straight-line distance the agricultural machine must travel is:
$$l_f = w - 2R \tag{5}$$
where $l_f$ is the straight-line distance travelled during a Π-shaped turn; the length of the turning path is then $(\pi - 2)R + w$.
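The two closed-form lengths follow from the turn geometry given in the detailed description, where each turn consists of two quarter-circle arcs of radius R joined by a straight segment. The short derivation below is a worked check of formulas (4) and (5); the names $L_{\text{fishtail}}$ and $L_{\Pi}$ are introduced here only for illustration.

```latex
\begin{aligned}
L_{\text{fishtail}} &= 2 \cdot \frac{\pi R}{2} + l_r = \pi R + (2R - w) = (2 + \pi)R - w, \\
L_{\Pi}             &= 2 \cdot \frac{\pi R}{2} + l_f = \pi R + (w - 2R) = (\pi - 2)R + w.
\end{aligned}
```

At $w = 2R$ both expressions reduce to $\pi R$, the semicircular case, so the three turn types join continuously.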
Further, the method for optimizing the global path based on the Q-learning algorithm in step S6 is as follows:
S61: defining a Q-value table and initializing it to start the calculation;
S62: defining a state flag table Flag that stores, for each straight path, whether it has been connected;
S63: randomly selecting an initial straight path to determine where the agricultural machine starts working;
S64: judging whether the straight path has been connected;
S65: if the straight path has been connected, checking the convergence of the Q-value table directly;
if the straight path has not been connected, selecting the next action set based on the current state of the agricultural machine, calculating the reward values of that action set, and taking the action with the maximum reward as the next action; updating the Q-value table and the state flag table Flag; and then checking the convergence of the Q-value table;
S66: if the Q-value table has converged, the calculation ends;
if the Q-value table has not converged, judging whether any element of the current Q-value table has converged;
S67: if no element of the current Q-value table has converged, repeating S65-S66;
if an element of the current Q-value table has converged, locking the corresponding state, updating the Q-value table, and repeating S66.
Further, whether a straight path has been connected is determined in S64 as follows:
$$F(s_n) = \begin{cases} 0, & \text{if } s_n \text{ has been connected} \\ 1, & \text{otherwise} \end{cases} \tag{6}$$
where $F(s_n)$ is the state quantity that records whether each straight path has been connected.
Further, the Q-value function used in step S65 to judge the convergence of the Q-value table is:
$$Q(s_c, a'_c) = Q(s_l, a'_l) + \gamma \cdot \max\bigl(r(s_c, f(s_c, \delta))\bigr) \tag{7}$$
where $Q(s_c, a'_c)$ is the Q value of the action $a'_c$ corresponding to the reward in the current state $s_c$ of the agricultural machine; $s_l$ is the state preceding the current state; $a'_l$ is the action corresponding to the maximum reward in that preceding state; $\gamma$ is the discount factor; $r(s_c, s_n)$ is the reward function for moving from the current state $s_c$ to the next state $s_n$; and $f(s_c, \delta)$ is the set of states reachable from the current state $s_c$ under the optional action set $\delta$, namely:
[Formula (8): equation image BDA0003237534050000042 in the original publication]
where the symbol shown in equation image BDA0003237534050000043 is the first selectable action and m is the number of selectable actions.
Further, the reward function used to calculate the reward value in step S65 is:
[Formula (9): equation image BDA0003237534050000044 in the original publication]
where $D(s_c, s_n)$ is the length of the turning path from the current state $s_c$ of the agricultural machine to the next state $s_n$, and L is a weighting coefficient.
Further, before steps S1 to S6, the method also includes:
if there is an obstacle inside the field to be optimized, or the field is not a convex polygon, first dividing the complex field into a plurality of convex polygonal sub-regions, and then planning the path of each sub-region by the method of steps S1-S6.
Further, after steps S1 to S6, the method also includes: planning the paths of all the convex polygonal sub-regions to obtain the optimal path of the whole field.
Beneficial effects: in the agricultural machinery path optimization method based on improved Q-learning, the optimal rotation angle of the original field is calculated, and the parallel paths of the working area are then generated parallel to the boundary at that angle, so that the parallel paths are aligned with one edge of the field and the calculation is greatly simplified. The turning paths are then planned, and the global path is optimized based on the improved Q-learning algorithm to determine the minimum total working length of the agricultural machine. The planned global path is the shortest, which improves working efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of an overall agricultural machine path optimization method of the present invention;
FIG. 2 is a schematic diagram of planning parallel paths according to the present invention;
FIG. 3 is a schematic diagram of a complex field being divided into a plurality of convex polygons by a cell-decomposition method;
FIG. 4a is a schematic view of a fishtail turning path and its parameters in accordance with the present invention;
FIG. 4b is a schematic view of a semicircular turn path and its parameters in accordance with the present invention;
FIG. 4c is a schematic diagram of a pi turn path and its parameters according to the present invention;
FIG. 5 is a flow chart of an agricultural machinery path optimization method based on improved Q-learning according to the present invention.
Reference numerals: 1, straight path; 2, boundary of the agricultural machinery working area.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
This embodiment provides an agricultural machinery path optimization method based on improved Q-learning which, as shown in fig. 1, comprises the following steps:
S1: determining the initial parameters of path planning, which comprise the original field boundary point set P, the width w scanned by each row of the agricultural machine, and the minimum turning radius R of the agricultural machine;
S2: translating the original field boundary point set P toward the interior of the field boundary by a distance L to determine the boundary of the agricultural machinery working area, within which the straight paths are then planned;
S3: establishing an x-y rectangular coordinate system and calculating the minimum span of the working area to determine the optimal rotation angle of the original field boundary relative to the x axis;
In S3, the minimum span of the working area is calculated as follows. Straight paths are simple to work along and give high coverage, so they are used to cover the main working area of the agricultural machine. Planning the straight paths focuses on finding a good travel direction that reduces the number of turns. For a convex polygonal field with no internal obstacles, the straight paths are continuous in every direction, so planning is relatively simple. Because the work content of the machine is generally fixed and must cover the whole field, the spacing between adjacent straight paths is equal; minimizing the number of turns is therefore equivalent to finding the minimum span of the convex polygon in the direction perpendicular to the straight paths. As shown in fig. 2, the minimum span of a convex polygon always occurs between a vertex and an edge, so the optimal direction is parallel to one edge of the convex polygon. Specifically, for each edge the span between that edge and the vertex farthest from it is calculated, and the direction of the edge with the smallest span is chosen as the optimal straight-path direction for the convex polygonal field.
Specifically, when the boundary of the working area is a convex polygon, the number of turns n of the agricultural machine is:
[Formula (1): equation image BDA0003237534050000061 in the original publication]
where D is the distance from one boundary edge of the working area to a vertex of the working area, and the scanning line is a straight line parallel to the x axis in the x-y rectangular coordinate system:
$$y = y_{\min}, \quad x \in [x_{\min}, x_{\max}] \tag{2}$$
where $y_{\min}$ and $y_{\max}$ are the minimum and maximum values of the working-area boundary on the y axis, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the working-area boundary on the x axis. Since $D = y_{\max} - y_{\min}$, reducing the number of turns amounts to reducing D;
the angle of each rotation of the original field is:
[Formula (3): equation image BDA0003237534050000062 in the original publication]
where $[x_1, y_1]$ and $[x_2, y_2]$ are the start and end coordinates of the edge of the working-area boundary that is parallel to the x axis, and $\theta_t$ is the rotation angle. After multiple rotations, the rotation angle at which the span D is smallest is the optimal rotation angle $\theta^*$ of the working area.
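The search over rotation angles described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: formula (3) is an image in the original publication, so the edge angle is computed here with `math.atan2`, which is an assumed form, and the function names `rotate` and `optimal_rotation` are hypothetical.

```python
import math

def rotate(points, angle):
    """Rotate points counter-clockwise about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def optimal_rotation(polygon):
    """Return (theta_star, D_min) for a convex polygon given as ordered vertices.

    For every edge, the polygon is rotated so that the edge is parallel to the
    x axis, and the span D = y_max - y_min is measured. The edge with the
    smallest span gives the optimal rotation angle.
    """
    best = None
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        theta_t = math.atan2(y2 - y1, x2 - x1)   # assumed form of the edge angle
        rotated = rotate(polygon, -theta_t)      # edge i is now parallel to x
        ys = [y for _, y in rotated]
        span = max(ys) - min(ys)                 # D = y_max - y_min
        if best is None or span < best[1]:
            best = (theta_t, span)
    return best
```

For a 4 x 2 rectangle, the smallest span is 2 and is reached along either long edge, so the straight paths run parallel to the long side.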
S4: generating parallel paths in the working area, parallel to the boundary at the optimal rotation angle, to determine the straight paths of the agricultural machine. Specifically:
S41: the straight line at the optimal rotation angle $\theta^*$ is used as a scanning line and is translated toward the interior of the working area by w each time; after each translation, the number of intersection points between the scanning line and the boundary of the working area is calculated;
S42: if there are 2 intersection points, the scanning line is considered to be still within the boundary of the working area, and scanning continues; if there is 1 intersection point or none, the scanning line is judged to have left the boundary of the working area, scanning stops, and the generation of the parallel paths is complete.
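Steps S41 and S42 can be sketched as a scanline loop. The sketch assumes the polygon has already been rotated so that the scanning line is horizontal and starts at $y = y_{\min}$ as in formula (2). The function name `parallel_paths`, the tolerance `eps`, and the treatment of a scan line lying on a horizontal edge as two intersection points are assumptions made for illustration.

```python
def parallel_paths(polygon, w, eps=1e-9):
    """Generate straight paths (y, x_left, x_right) inside a convex polygon.

    The scan line starts at y_min and is translated by w each time (S41).
    Scanning stops as soon as the line meets the boundary in fewer than two
    distinct points (S42).
    """
    if w <= 0:
        raise ValueError("row width w must be positive")
    ys = [y for _, y in polygon]
    y_min, y_max = min(ys), max(ys)
    n = len(polygon)
    segments = []
    k = 0
    while True:
        y = y_min + k * w
        if y > y_max + eps:
            break
        xs = []
        for i in range(n):
            (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
            if abs(y1 - y2) < eps:
                # Horizontal edge: it contributes its end points if it lies on the line
                if abs(y - y1) < eps:
                    xs += [x1, x2]
            elif min(y1, y2) - eps <= y <= max(y1, y2) + eps:
                t = (y - y1) / (y2 - y1)
                xs.append(x1 + t * (x2 - x1))
        if not xs or max(xs) - min(xs) < eps:
            break                      # 0 or 1 intersection point: stop scanning
        segments.append((y, min(xs), max(xs)))
        k += 1
    return segments
```

For a triangle with vertices (0, 0), (4, 0), (2, 2) and w = 1, the loop returns the paths at y = 0 and y = 1 and stops at the apex, where the scan line touches the boundary in a single point.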
S5: determining the type of each turning path and calculating its length;
Specifically, the turning paths in S5 include semicircular, fishtail, and Π-shaped paths, as shown in figs. 4a to 4c;
Because the agricultural machine travels slowly, its turning path can be regarded as an arc of fixed radius; the turning radius of the planned path is assumed to be R. In addition, the spacing between adjacent straight paths, i.e. the width w scanned by each row of the machine, is constant, so the turning path depends on the spacing between the straight paths and on the turning radius.
If the spacing between adjacent straight paths equals twice the turning radius, i.e. $w = 2R$, the turning path is semicircular and its length is $\pi R$;
if the spacing between adjacent straight paths is less than twice the turning radius, i.e. $w < 2R$, the turning path is fishtail-shaped. The agricultural machine cannot turn through 180 degrees in one pass, so it first turns through a quarter circle, then reverses in a straight line for a distance $l_r$, and finally turns through another quarter circle to complete the turn, where:
$$l_r = 2R - w \tag{4}$$
where $l_r$ is the straight-line distance travelled during a fishtail turn and R is the turning radius; the length of the turning path is then $(2 + \pi)R - w$;
if the spacing between adjacent straight paths is greater than twice the turning radius, i.e. $w > 2R$, the turning path is Π-shaped. The agricultural machine first turns through a quarter circle, then drives forward in a straight line for a distance $l_f$, and finally turns through another quarter circle to complete the turn, where:
$$l_f = w - 2R \tag{5}$$
where $l_f$ is the straight-line distance travelled during a Π-shaped turn; the length of the turning path is then $(\pi - 2)R + w$.
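The three cases can be collected into one small function. This is a sketch for illustration: the function name `turn_length`, the string labels, and the use of `math.isclose` to detect the semicircular case are choices made here, not part of the source. The argument `spacing` is the distance between the two straight paths being connected, which equals w for adjacent rows.

```python
import math

def turn_length(spacing, R):
    """Return (turn type, turn length) for two straight paths `spacing` apart."""
    if math.isclose(spacing, 2 * R):
        return "semicircle", math.pi * R
    if spacing < 2 * R:
        # Fishtail: two quarter arcs plus a reverse segment l_r = 2R - spacing
        return "fishtail", (2 + math.pi) * R - spacing
    # Pi-shaped: two quarter arcs plus a forward segment l_f = spacing - 2R
    return "pi", (math.pi - 2) * R + spacing
```

With R = 1, a spacing of 1 gives a fishtail turn of length 1 + π, a spacing of 2 gives a semicircle of length π, and a spacing of 4 gives a Π-shaped turn of length 2 + π.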
S6: optimizing the path of the working area based on the Q-learning algorithm to determine the optimal total working length of the agricultural machine;
Q-learning is a classic reinforcement learning algorithm that optimizes the decisions of an agent using the results of the agent's interaction with its environment. Specifically, when the agent (i.e. the agricultural machine) performs action a in state s ($s \in S$) at a given time, $r(s, a)$ is the immediate reward that the environment (i.e. the straight path the machine is on when it performs action a) gives the agent. As the agent performs a series of actions from an initial state to a target state, the environment evaluates each action, and the optimal action sequence is selected by maximizing the reward.
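For reference, the textbook tabular Q-learning update is shown below. It is given only as background for comparison with the modified Q-value function of this method and is not taken from the patent; the learning rate $\alpha$ and the successor state $s'$ are symbols introduced here.

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r(s, a) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

Here $\gamma$ is the discount factor and the maximum is taken over the actions $a'$ available in the successor state $s'$.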
The method for optimizing the global path based on the improved Q-learning algorithm comprises the following steps:
S61: defining a Q-value table and initializing it to start the iterative calculation; all initial values of the Q-value table are 0;
S62: defining a state flag table Flag that stores, for each path, whether it has been connected; all initial values of the Flag table are 1;
S63: randomly selecting an initial straight path to determine where the agricultural machine starts working;
S64: judging whether the straight path has been connected;
Preferably, whether the straight path has been connected is determined in S64 as follows:
$F(s_n)$ is the state quantity that records whether each straight path has been connected; if $s_n$ has been connected it is set to 0, otherwise to 1, namely:
$$F(s_n) = \begin{cases} 0, & \text{if } s_n \text{ has been connected} \\ 1, & \text{otherwise} \end{cases} \tag{6}$$
S65: if the straight path has been connected, checking the convergence of the Q-value table directly;
if the straight path has not been connected, selecting the next action set based on the current state of the agricultural machine, calculating the reward values of that action set, and taking the action with the maximum reward as the next action; updating the Q-value table and the state flag table Flag; and then checking the convergence of the Q-value table;
S66: if the Q-value table has converged, the calculation ends;
if the Q-value table has not converged, judging whether any element of the current Q-value table has converged;
Specifically, agricultural machinery path optimization aims to reduce the total path length. Provided that no straight path is travelled twice, the global path length depends on the lengths of the turning paths, which in turn depend on their types, so the global path plan can be optimized by adjusting the types of the turning paths. The type of a turning path is determined by the relationship between the turning radius and the spacing of the two straight paths it connects. The current straight path of the agricultural machine (i.e. the agent) can therefore be regarded as the current state $s_c$, the selection of the next straight path as the action $a_c$, and the selected next straight path as the state $s_n$ at the next time. The Q-value function to be optimized is thus established as:
$$Q(s_c, a'_c) = Q(s_l, a'_l) + \gamma \cdot \max\bigl(r(s_c, f(s_c, \delta))\bigr) \tag{7}$$
where $Q(s_c, a'_c)$ is the Q value of the action $a'_c$ corresponding to the reward in the current state $s_c$ of the agricultural machine; $s_l$ is the state preceding the current state; $a'_l$ is the action corresponding to the maximum reward in that preceding state; $\gamma$ is the discount factor ($0 < \gamma < 1$); $r(s_c, s_n)$ is the reward function for moving from the current state $s_c$ to the next state $s_n$; and $f(s_c, \delta)$ is the set of possible next states reachable from the current state $s_c$ under the optional action set $\delta$, namely:
[Formula (8): equation image BDA0003237534050000091 in the original publication]
where the symbol shown in equation image BDA0003237534050000092 is the first selectable action and m is the number of selectable actions;
For the agricultural machinery path optimization problem, the reward function used to calculate the reward value in step S65 is:
[Formula (9): equation image BDA0003237534050000093 in the original publication]
where $D(s_c, s_n)$ is the length of the turning path from the current state (i.e. straight path) $s_c$ of the agricultural machine to the next state $s_n$, and L is a weighting coefficient, for which the length of the longest turning path or straight path can be chosen so that the two terms are of the same order of magnitude;
S67: if no element of the current Q-value table has converged, repeating S65-S66;
if an element of the current Q-value table has converged, locking the corresponding state, updating the Q-value table according to the transfer relationship, and repeating step S66.
Preferably, the method for optimizing the global path based on the improved Q-learning algorithm of the present invention is shown in fig. 5: assuming that a certain problem has i states and j inputs, and combining the agricultural machinery path optimization problem, firstly defining a Q value table with the initial value of each element being 0 and a state Flag table with the initial value of each element being 1, wherein the state Flag table is used for storing the state quantity of whether each path is connected or not. In this embodiment, the dimension of the Q-value table is (k × 4), where m ═ d/w ], where [ ] is the rounding calculation, and k is the number of straight paths; the dimension of the state Flag table Flag is (k × 1); secondly, determining a calculated initial state (namely an agricultural machinery starting straight path), and then finding an optimal path by the method for optimizing the global path based on the Q-learning algorithm.
One iteration proceeds as follows: determine the set of possible next states from the current state, calculate the reward value of each state in the set, and select the state with the maximum reward to update the Q-value table and the state flag table Flag. The iteration repeats until the Q-value table converges, at which point the optimal path has been found. Note that when an element of the Q-value table converges during the iteration, it is locked, i.e., the action decision for the state corresponding to that element is no longer optimized. Because each state has only a finite number of actions, locking an element also determines the next optimal state, which can be locked at the same time. The method therefore avoids calculating the reward values of all possible next states in every iteration: once an optimal state is determined, the following state can be obtained quickly from the transfer relationship, which greatly reduces the amount of calculation.
In theory, every straight path could serve as the next state, which greatly increases the iteration complexity of the Q-learning algorithm. Moreover, repeated paths increase the total path length, which is contrary to the optimization goal, and admitting paths that are too far away as candidates increases the amount of calculation. In addition, for a pi-shaped turning path, an excessively large $l_f$ works against minimizing the total path length. The present invention therefore restricts the set of possible next states $f(s_c, \delta)$ to straight paths close to the current one; specifically, the distance between the current state (i.e., straight path) $s_c$ of the agricultural machine and the next state $s_n$ to be connected is less than or equal to 4w.
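The 4w restriction on the next-state set can be sketched as follows. This is an illustration under stated assumptions: straight paths are indexed in sweep order with equal spacing w (so an index difference of n means a distance of n × w), and a plain boolean list `connected` stands in for the state flag table, whose exact encoding is given only in a figure. The function name `candidate_next_states` is hypothetical.

```python
def candidate_next_states(current: int, connected: list[bool],
                          max_offset: int = 4) -> list[int]:
    """Indices of straight paths that may follow `current`.

    Assumes equally spaced paths indexed 0..k-1, so |j - current| <= max_offset
    corresponds to a distance of at most max_offset * w.
    """
    k = len(connected)
    lo = max(0, current - max_offset)
    hi = min(k - 1, current + max_offset)
    # Keep only paths that are not yet connected and differ from the current one.
    return [j for j in range(lo, hi + 1)
            if j != current and not connected[j]]


connected = [False] * 10
connected[3] = True   # the straight path the machine is currently on
connected[4] = True   # a path that has already been worked
print(candidate_next_states(3, connected))
```

With at most eight candidates per state, the reward evaluation in each iteration stays bounded regardless of how many straight paths the field contains.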
Preferably, in this embodiment, before the steps S1-S6, the method further includes the following. If the field to be optimized contains an obstacle or is not a convex polygon, the path planning process and the resulting path both become complex, and the repetition rate and the omission rate of the working path of the agricultural machine increase. In that case, the complex field is first divided into several convex polygonal sub-areas using the well-established cell-decomposition method, as shown in fig. 3. Specifically, parallel lines are drawn through the vertices on all boundaries of the field (including the outer boundary, obstacles and the like), dividing the field into several small areas, such as the areas numbered 1-7 in fig. 3. Every area obtained in this way is a convex polygon. The invention then checks whether the area formed by merging adjacent sub-areas (those sharing a common boundary) is itself a convex polygon; if so, the merged area is taken as a final sub-area (for example, the area formed by sub-areas 1 and 2, and the area formed by sub-areas 5 and 6, in fig. 3). Path planning is then performed on each convex polygonal sub-area by the method of steps S1-S6.
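The convexity check applied to merged sub-areas can be sketched with a standard cross-product test. The patent does not state which test it uses, so this is an assumed, commonly used approach; it presumes a simple (non-self-intersecting) polygon given as an ordered vertex list, and the name `is_convex` is hypothetical.

```python
Point = tuple[float, float]


def is_convex(polygon: list[Point], eps: float = 1e-9) -> bool:
    """True if the ordered vertices of a simple polygon form a convex polygon."""
    n = len(polygon)
    if n < 3:
        return False
    sign = 0
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cx, cy = polygon[(i + 2) % n]
        # z-component of the cross product of edges AB and BC.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if abs(cross) <= eps:
            continue  # collinear vertices do not break convexity
        turn = 1 if cross > 0 else -1
        if sign == 0:
            sign = turn
        elif turn != sign:
            return False  # the boundary turns both ways: a reflex vertex exists
    return sign != 0


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
l_shape = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0), (1.0, 2.0), (0.0, 2.0)]
print(is_convex(square), is_convex(l_shape))
```

In a merging step, the outline of two adjacent cells would be passed to such a test, and the cells would be merged only if it returns true.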
Preferably, after the steps S1-S6, the method further comprises connecting the planned paths of the convex polygonal sub-areas to obtain the optimal path for the whole field. Specifically, for a complex field, once path planning inside each convex polygonal sub-area is complete, the sub-area paths are connected to complete the path planning of the whole field. To minimize the total path length, the path between sub-areas is a straight line. Automated agricultural machinery generally works in large fields, so the number of convex polygons is much smaller than the number of straight paths within a convex polygon. In this embodiment, the shortest connection between convex polygons is therefore selected by enumeration. Specifically, an initial sub-area is selected first, and then any remaining convex polygon is selected as the next one to connect, until all convex polygons are connected. Each ordering of the sub-areas is treated as one combination; the path lengths of all combinations are calculated, and the shortest is chosen as the optimal connection scheme, completing the path planning of the whole field.
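The enumeration over sub-area orderings can be sketched as below. As a simplifying assumption, each sub-area is represented by a single connection point, whereas in practice the exit point of one sub-area and the entry point of the next would differ; the function name `best_region_order` and the sample coordinates are hypothetical.

```python
from itertools import permutations
from math import dist


def best_region_order(points: list[tuple[float, float]], start: int = 0):
    """Enumerate every visiting order that begins at `start` and return the
    order with the shortest total straight-line connection length."""
    others = [i for i in range(len(points)) if i != start]
    best_order, best_len = None, float("inf")
    for perm in permutations(others):
        order = (start, *perm)
        # Sub-areas are joined by straight segments, as described above.
        length = sum(dist(points[a], points[b])
                     for a, b in zip(order, order[1:]))
        if length < best_len:
            best_order, best_len = order, length
    return best_order, best_len


regions = [(0.0, 0.0), (10.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
order, total = best_region_order(regions)
print(order, total)
```

Enumeration costs (n-1)! evaluations for n sub-areas, which is acceptable only because the number of sub-areas is small compared with the number of straight paths.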
The invention has the advantages that:
1: the method decomposes a complex field into several simple sub-areas and splits global path planning into path planning within sub-areas and path planning between sub-areas, which reduces the complexity of field path planning. It also introduces an agricultural machinery path optimization method based on improved Q-learning for path planning, which improves the working efficiency of the agricultural machine.
2: by introducing the improved Q-learning algorithm, the method addresses the heavy computational load of traditional algorithms; it also introduces a state flag table suited to the characteristics of agricultural machinery paths and designs an appropriate termination condition for path planning.
3: the invention locks states that have already reached the optimum during the iteration and then quickly completes the optimization of adjacent states through the transfer relationship, which greatly reduces the computational load of the iterative process.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An agricultural machinery path optimization method based on improved Q-learning is characterized by comprising the following steps:
S1: determining initial parameters of path planning, wherein the initial parameters comprise an original field boundary point set P, the scanning width w of each line of the agricultural machine and the minimum turning radius R of the agricultural machine;
S2: translating the original field boundary point set P toward the interior of the field boundary by a distance L so as to determine the boundary of the agricultural machinery working area;
S3: establishing an x-y rectangular coordinate system, and calculating the minimum span of the agricultural machinery working area to determine the optimal rotation angle of the original field boundary relative to the x axis;
S4: generating parallel paths within the boundary of the agricultural machinery working area at the optimal rotation angle so as to determine the straight paths of the agricultural machine;
S5: determining the type of the turning path, and calculating the length of the turning path;
S6: optimizing the path of the agricultural machinery working area based on an improved Q-learning algorithm to determine the optimal total working length of the agricultural machine.
2. The agricultural machinery path optimization method based on the improved Q-learning of claim 1, wherein in the step S3, the method for calculating the minimum span of the agricultural machinery working area is as follows:
when the boundary of the working area of the agricultural machine is a convex polygon, the turning times n of the agricultural machine are as follows:
Figure FDA0003237534040000011
where D is the distance from one boundary of the agricultural work area to the apex of the agricultural work area,
$y = y_{min},\ x \in [x_{min}, x_{max}]$ (2)
where $y_{min}$ and $y_{max}$ are the minimum and maximum values of the boundary of the agricultural machinery working area on the y axis, and $x_{min}$ and $x_{max}$ are its minimum and maximum values on the x axis;
the rotation angle of the original field at each rotation is:
Figure FDA0003237534040000012
where $[x_1, y_1]$ and $[x_2, y_2]$ are the start-point and end-point coordinates of the side of the agricultural machinery working-area boundary that is parallel to the x axis, and $\theta_t$ is the rotation angle; after multiple rotations, the rotation angle at which the span D is smallest is the optimal rotation angle $\theta^*$ of the agricultural machinery working area.
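The search for the minimum span in claim 2 can be illustrated with a short Python sketch. Because formula (3) appears only as a figure, the sketch assumes that each trial angle is the inclination of one boundary edge, obtained with `atan2`, and that the span D is the extent of the rotated polygon along the y axis; the name `min_span_rotation` is hypothetical.

```python
from math import atan2, cos, sin


def min_span_rotation(polygon: list[tuple[float, float]]):
    """Try aligning each edge with the x axis and return (angle, span) for the
    alignment with the smallest extent along y."""
    best_angle, best_span = 0.0, float("inf")
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        theta = atan2(y2 - y1, x2 - x1)  # assumed form of the edge angle
        # Rotate every vertex by -theta; only the rotated y values are needed.
        c, s = cos(-theta), sin(-theta)
        ys = [x * s + y * c for x, y in polygon]
        span = max(ys) - min(ys)
        if span < best_span - 1e-12:  # keep the first of equally good angles
            best_angle, best_span = theta, span
    return best_angle, best_span


field = [(0.0, 0.0), (10.0, 0.0), (10.0, 4.0), (0.0, 4.0)]
angle, span = min_span_rotation(field)
print(angle, span)
```

For the 10 by 4 rectangle used here, aligning a long side with the x axis gives the smaller span, so the straight paths would run along the long side.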
3. The method for optimizing the path of an agricultural machine based on the improved Q-learning of claim 1, wherein in S4, the method for determining the straight path of the agricultural machine is as follows:
S41: a straight line at the optimal rotation angle $\theta^*$ is used as the scanning line and translated toward the interior of the agricultural machinery working area by w each time; the number of intersection points between the scanning line and the boundary of the agricultural machinery working area is calculated after each translation;
S42: if the number of intersection points is 2, the two intersection points are considered to be still within the boundary of the agricultural machinery working area, and scanning continues; if the number of intersection points is 1 or 0, the scanning line is judged to have left the boundary of the agricultural machinery working area, scanning stops, and the generation of the parallel paths is complete.
4. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, wherein the turning path in S5 comprises a semicircle type, a fishtail type and a pi type;
if the distance between adjacent straight paths is equal to twice the turning radius, i.e., w = 2R, the turning path is semicircular, and the length of the turning path is π × R;
if the distance between adjacent straight paths is less than twice the turning radius, i.e., w < 2R, the turning path is fishtail type; the straight-line distance that the agricultural machine needs to travel in this case is:
$l_r = 2R - w$ (4)
where $l_r$ is the straight-line distance the agricultural machine must travel when turning on a fishtail turning path and R is the turning radius; the length of the turning path is (2 + π) × R − w;
if the distance between adjacent straight paths is greater than twice the turning radius, i.e., w > 2R, the turning path is pi-shaped; the straight-line distance that the agricultural machine needs to travel in this case is:
$l_f = w - 2R$ (5)
where $l_f$ is the straight-line distance the agricultural machine must travel when turning on a pi-shaped turning path; the length of the turning path in this case is (π − 2) × R + w.
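The three turning cases of claim 4 translate directly into a small function. The sketch below uses only the lengths stated above; the function name `turn_path_length` and the tolerance used to detect w = 2R are assumptions.

```python
from math import isclose, pi


def turn_path_length(w: float, R: float) -> tuple[str, float]:
    """Return (turn type, turn length) for path spacing w and turning radius R."""
    if w <= 0 or R <= 0:
        raise ValueError("w and R must be positive")
    if isclose(w, 2 * R):
        return "semicircle", pi * R
    if w < 2 * R:
        # Fishtail: straight distance l_r = 2R - w, total (2 + pi) * R - w.
        return "fishtail", (2 + pi) * R - w
    # Pi-shaped: straight distance l_f = w - 2R, total (pi - 2) * R + w.
    return "pi", (pi - 2) * R + w


for spacing in (1.0, 2.0, 3.0):
    print(spacing, turn_path_length(spacing, R=1.0))
```

Both the fishtail and the pi-shaped formulas reduce to π × R at w = 2R, so the turn length is continuous across the three cases.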
5. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, wherein the method for optimizing the global path based on the Q-learning algorithm in the step S6 is as follows:
S61: defining a Q value table and initializing it to start the calculation;
S62: defining a state flag table Flag to store the state quantity indicating whether each straight path is connected;
S63: randomly selecting an initial path of the agricultural machine to determine the straight path on which the agricultural machine starts working;
S64: judging whether the straight path is connected;
S65: if the straight path is connected, directly judging the convergence condition of the Q value table;
if the straight path is not connected, selecting the next action set based on the current state of the agricultural machine, and calculating the reward values of the next action set to obtain the next action with the maximum reward value; updating the Q value table and the state flag table Flag; then judging the convergence condition of the Q value table;
S66: if the Q value table has converged, the calculation ends;
if the Q value table has not converged, judging whether a converged element exists in the current Q value table;
S67: if no converged element exists in the current Q value table, repeating S65-S66;
if a converged element exists in the current Q value table, locking the state, updating the Q value table, and repeating S66.
6. The agricultural machinery path optimization method based on the improved Q-learning of claim 5, wherein the calculation method for judging whether the straight path is connected in the S64 is as follows:
Figure FDA0003237534040000031
wherein: f(s)n) Is the state quantity of whether each path is connected.
7. The method as claimed in claim 5, wherein the Q-value function established by determining the convergence status of the Q-value table in step S65 is:
$Q(s_c, a'_c) = Q(s_l, a'_l) + \gamma \cdot \max\big(r(s_c, f(s_c, \delta))\big)$ (7)
where $Q(s_c, a'_c)$ is the Q value of the action $a'_c$ that corresponds to the maximum reward in the current state $s_c$ of the agricultural machine; $s_l$ is the state preceding the current state of the agricultural machine; $a'_l$ is the action that corresponds to the maximum reward in that preceding state; $\gamma$ is the discount factor; $r(s_c, s_n)$ is the reward function for moving from the current state $s_c$ to the next state $s_n$; and $f(s_c, \delta)$ denotes the set of states reachable from the current state $s_c$ under the optional action set $\delta$, namely:
Figure FDA0003237534040000032
wherein
Figure FDA0003237534040000033
is the first selectable action and m is the number of selectable actions.
8. The method for optimizing Q-learning based agricultural machinery path according to claim 5, wherein the reward function of calculating the reward value in step S65 is established as:
Figure FDA0003237534040000034
in the formula, $D(s_c, s_n)$ is the length of the turning path from the current state $s_c$ of the agricultural machine to the next state $s_n$; L is a weighting coefficient.
9. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, further comprising, before the steps S1-S6:
if there is an obstacle in the field to be optimized or the field to be optimized is not a convex polygon field, the complex field is firstly divided into a plurality of convex polygon sub-regions, and path planning is performed on the convex polygon sub-regions by the method of the steps S1-S6.
10. The method for optimizing the agricultural machinery path based on the improved Q-learning of claim 1, further comprising after the steps S1-S6: and planning paths of the sub-areas of the convex polygons to obtain the optimal path of the whole field.
CN202111006894.1A 2021-08-30 2021-08-30 Agricultural machinery path optimization method based on improved Q-learning Active CN113848880B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111006894.1A CN113848880B (en) 2021-08-30 2021-08-30 Agricultural machinery path optimization method based on improved Q-learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111006894.1A CN113848880B (en) 2021-08-30 2021-08-30 Agricultural machinery path optimization method based on improved Q-learning

Publications (2)

Publication Number Publication Date
CN113848880A (en) 2021-12-28
CN113848880B (en) 2023-12-22

Family

ID=78976547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111006894.1A Active CN113848880B (en) 2021-08-30 2021-08-30 Agricultural machinery path optimization method based on improved Q-learning

Country Status (1)

Country Link
CN (1) CN113848880B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023207022A1 (en) * 2022-04-24 2023-11-02 丰疆智能软件科技(南京)有限公司 Path planning method and system for automatic operation of agricultural machinery, and device and storage medium

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102167038A (en) * 2010-12-03 2011-08-31 北京农业信息技术研究中心 Method and device for generating all-region-covering optimal working path for farmland plot
CN107807644A (en) * 2017-10-30 2018-03-16 洛阳中科龙网创新科技有限公司 A kind of farm machinery consumption minimization trajectory path planning method
CN108089185A (en) * 2017-03-10 2018-05-29 南京沃杨机械科技有限公司 The unmanned air navigation aid of agricultural machinery perceived based on farm environment
EP3363273A1 (en) * 2017-02-16 2018-08-22 Amazonen-Werke H. Dreyer GmbH & Co. KG Agricultural machine system and method for planning lanes for processing an agricultural field
CN109828575A (en) * 2019-02-22 2019-05-31 山东省计算中心(国家超级计算济南中心) A kind of paths planning method effectively improving agricultural machinery working efficiency
US20190208695A1 (en) * 2015-12-03 2019-07-11 Mogens Max Sophus Edzard Graf Plessen Path Planning for Area Coverage
CN110597288A (en) * 2019-09-29 2019-12-20 陈�峰 Algorithm based on agricultural machinery field unmanned operation path planning
CN111189444A (en) * 2020-03-26 2020-05-22 洛阳智能农业装备研究院有限公司 Automatic driving agricultural machinery field operation path planning system and planning method
CN111580514A (en) * 2020-05-07 2020-08-25 中国船舶重工集团公司第七一六研究所 Mobile robot optimal path covering method based on combined formation
CN111639811A (en) * 2020-06-01 2020-09-08 中国农业大学 Multi-agricultural-machine cooperative work remote management scheduling method based on improved ant colony algorithm
CN111721296A (en) * 2020-06-04 2020-09-29 中国海洋大学 Data driving path planning method for underwater unmanned vehicle
CN112015176A (en) * 2020-08-14 2020-12-01 合肥工业大学 Unmanned tractor field operation path planning method and device
CN112197775A (en) * 2020-11-12 2021-01-08 扬州大学 Agricultural machinery multi-machine cooperative operation path planning method
CN113190017A (en) * 2021-05-24 2021-07-30 东南大学 Harvesting robot operation path planning method based on improved ant colony algorithm
CN113313784A (en) * 2021-04-29 2021-08-27 北京农业智能装备技术研究中心 Method and device for making farmland picture based on unmanned agricultural machine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
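Meng Zhijun; Liu Hui; Wang Hua; Fu Weiqiang: "Path optimization method for farmland operation machinery" (农田作业机械路径优化方法), Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), no. 06, pages 147-152 *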

Also Published As

Publication number Publication date
CN113848880B (en) 2023-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant