CN113219834B

CN113219834B - Scraper knife control method, model and device, computer equipment and storage medium

Info

Publication number: CN113219834B
Application number: CN202110532630.3A
Authority: CN
Inventors: 周诚; 胡滨; 骆汉宾; 杨继红; 尤轲; 王彦东; 李迟典; 张士聪
Original assignee: Huazhong University of Science and Technology; Shantui Chutian Construction Machinery Co Ltd
Current assignee: Huazhong University of Science and Technology; Shantui Chutian Construction Machinery Co Ltd
Priority date: 2021-05-17
Filing date: 2021-05-17
Publication date: 2023-04-07
Anticipated expiration: 2041-05-17
Also published as: CN113219834A

Abstract

An embodiment of the invention discloses a blade control method, a blade control model, a blade control device, computer equipment and a storage medium. The blade control method comprises the following steps: acquiring curved surface information of a target curved surface at the current moment; acquiring target coordinate information of a mark point on a blade at the current moment; determining a value function corresponding to each control strategy based on the blade control model, the curved surface information and the target coordinate information, wherein the blade control model comprises a plurality of control strategies; and controlling the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies. The embodiment of the invention can improve the control precision of the blade.

Description

Scraper knife control method, model and device, computer equipment and storage medium

Technical Field

The embodiment of the invention relates to the technical field of machinery, in particular to a scraper knife control method, a scraper knife control model, a scraper knife control device, computer equipment and a storage medium.

Background

A bulldozer is an engineering machine that is widely used in civil engineering, water conservancy, agriculture, forestry and the like. It improves the efficiency of engineering operations and is very important in the construction process.

A bulldozer works mainly by means of its blade, and at present an operator usually controls the movement of the blade according to the actual construction scene. However, the blade is generally difficult to control, so the control precision of manual control is low.

Disclosure of Invention

Embodiments of the present invention provide a method, a model, a device, a computer device, and a storage medium for controlling a blade, which can accurately control a state of the blade, so that a motion trajectory of the blade conforms to a target curved surface, thereby improving control accuracy of the blade.

In a first aspect, an embodiment of the present invention provides a method for controlling a blade, where the method includes:

acquiring curved surface information of a target curved surface at the current moment;

acquiring target coordinate information of a mark point on the blade at the current moment;

determining a value function corresponding to each control strategy based on a blade control model, the curved surface information and the target coordinate information, wherein the blade control model comprises a plurality of control strategies;

and controlling the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies.

In a second aspect, an embodiment of the present invention provides a blade control model, where the blade control model includes multiple control strategies;

the blade control model is used for processing input curved surface information and target coordinate information, determining a value function corresponding to each control strategy, and outputting the control strategy corresponding to the maximum value function;

and the control strategy corresponding to the maximum value function is used for controlling the movement of the blade.

In a third aspect, an embodiment of the present invention provides a blade control device, including:

the first acquisition module is used for acquiring the curved surface information of the target curved surface at the current moment;

the second acquisition module is used for acquiring target coordinate information of a mark point on the blade at the current moment;

the determining module is used for determining a value function corresponding to each control strategy based on the blade control model, the curved surface information and the target coordinate information, wherein the blade control model comprises a plurality of control strategies;

and the control module is used for controlling the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies.

In a fourth aspect, an embodiment of the present invention provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the blade control method according to the first aspect when executing the program.

In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the blade control method according to the first aspect.

In the embodiments of the present invention, the curved surface information of the target curved surface at the current moment and the target coordinate information of the mark point on the blade are acquired, the value function corresponding to each control strategy is determined based on the blade control model, the curved surface information and the target coordinate information, and the movement of the blade is then controlled by using the control strategy corresponding to the maximum value function among the plurality of control strategies. When the blade is controlled with the control strategy corresponding to the maximum value function, the state of the blade can be controlled accurately and the motion trajectory of the blade can be made to conform to the target curved surface. This solves the problem of low control precision of manual control in the related art and improves the control precision of the blade.

Drawings

Fig. 1 is a schematic flow chart of a blade control method according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of another blade control method according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of coordinates of a mark point in a camera of a camera device according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of a camera coordinate conversion process according to an embodiment of the present invention;

Fig. 5 is a schematic flow chart of a method for determining a value function corresponding to each control strategy according to an embodiment of the present invention;

Fig. 6 is a block diagram of a blade control device according to an embodiment of the present invention;

Fig. 7 is a block diagram of a determining module according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings, not all of them.

Engineering mechanization is a long-term development trend, and a bulldozer as an indispensable engineering machine in construction has wide application in the aspects of civil engineering, water conservancy, agriculture and forestry and the like. At present, the movement of the scraper knife is usually controlled manually by an operator, but the state of the scraper knife is usually difficult to control, so that the control precision of a manual control mode is low, and the control difficulty of the operator is high. Therefore, the bulldozer cannot be applied to the construction field with higher precision requirement, and the wide use of the bulldozer is influenced.

Embodiments of the present invention provide a blade control method, which may be performed by a blade control apparatus, where the apparatus may be implemented by software and/or hardware, and may be generally integrated in a computer device. The method can be applied to application scenes for controlling the movement of the scraper knife. Referring to fig. 1, fig. 1 is a schematic flow chart of a blade control method according to an embodiment of the present invention, where the method includes the following steps:

Step 101, obtaining curved surface information of a target curved surface at the current moment.

Optionally, the curved surface information may include at least one of: an equation of the target curved surface (e.g., an equation in world coordinates), coordinates of the target curved surface (e.g., three-dimensional coordinates in a world coordinate system), a model of the target curved surface (e.g., a three-dimensional model), and a spacing of a cutting edge of the blade from the target curved surface.

Step 102, acquiring target coordinate information of the mark point on the blade at the current moment.

The target coordinate information may include the world coordinates (i.e., coordinates in a world coordinate system) of the mark point. A camera device and a positioning device are mounted on the blade. Based on hand-eye calibration and the parallax principle, the computer device can obtain the target coordinate information by combining the position of the positioning device with the image information, position and Inertial Measurement Unit (IMU) data of the camera device. For example, the camera coordinates of a mark point of the blade in the camera coordinate system may be determined based on the image information, and the camera coordinates may then be converted into world coordinates based on the position of the camera device, the IMU data and the position of the positioning device to obtain the target coordinate information.

The camera device may include a binocular camera (including a left camera and a right camera) having an IMU built therein, and the positioning device may include a Real Time Kinematic (RTK) device.

Step 103, determining a value function corresponding to each control strategy based on the blade control model, the curved surface information and the target coordinate information, wherein the blade control model comprises a plurality of control strategies.

Step 104, controlling the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies.

Optionally, the output of the blade control model may be the control strategy corresponding to the maximum value function, i.e., the control strategy that satisfies the following formula:

$$\pi^{*} = \arg\max_{\pi} V_{\pi}$$

wherein $V_{\pi}$ denotes the value function corresponding to control strategy $\pi$.

The control strategy corresponding to the maximum value function can be used to control the extension and retraction of the oil cylinder of the bulldozer, and thereby the movement of the blade.

In summary, the blade control method provided in the embodiment of the present invention obtains the curved surface information of the target curved surface at the current moment and the target coordinate information of the mark point on the blade, determines the value function corresponding to each control strategy based on the blade control model, the curved surface information and the target coordinate information, and then controls the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies. When the blade is controlled with the control strategy corresponding to the maximum value function, the state of the blade can be controlled accurately, so that the motion trajectory of the blade conforms to the target curved surface and the control precision of the blade is improved.
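The selection in step 104 can be sketched as follows. This is a minimal illustration only: the strategy names and the numeric values are hypothetical, and how each value function is computed is left to the blade control model.

```python
from typing import Dict


def select_strategy(values: Dict[str, float]) -> str:
    """Return the name of the control strategy whose value function is largest."""
    if not values:
        raise ValueError("at least one control strategy is required")
    return max(values, key=values.get)


# Hypothetical value-function results for three control strategies.
example_values = {"pi_1": -0.42, "pi_2": -0.17, "pi_3": -0.30}
best = select_strategy(example_values)  # "pi_2" has the largest value
```

The selected strategy would then be handed to whatever component drives the oil cylinder; that interface is not specified here.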

The embodiment of the present invention provides another blade control method, which takes blade control implemented with a blade control model as an example. Before the method is described, the blade control model is described below.

The blade control model may include a Partially Observable Markov Decision Process (POMDP) model, which can predict a control strategy based on partially observable information. In addition to the plurality of control strategies, the POMDP model may include a state set S, a behavior set A, a state transition function T, a belief state B, a reward function R, an observation set O, and an observation function Ω.

The state set S includes a plurality of blade states and is fixed once determined. Each blade state includes the three-dimensional coordinates of two state points, which are located on the two sides of the blade (i.e., the two sides of the cutting edge of the blade). Illustratively, the state set may be $S = \{s_1, s_2, \ldots, s_n\}$, which includes n blade states. Each blade state may be a six-element array $(x_l, y_l, z_l, x_r, y_r, z_r)$, wherein the first three numbers $x_l$, $y_l$ and $z_l$ represent the coordinates of the state point on the left side of the blade, and the last three numbers $x_r$, $y_r$ and $z_r$ represent the coordinates of the state point on the right side of the blade. The number of blade states in the state set S is related to the control precision: the more blade states there are, the smaller the difference between two adjacent states and the higher the blade control precision.

The behavior set A includes a plurality of blade actions and is fixed once determined. Optionally, the blade actions may include at least one of: raising the blade, lowering the blade, tilting the blade to the left, tilting the blade to the right, and holding the blade unchanged. Alternatively, the blade actions may include at least one of: raising the blade quickly, raising the blade slowly, lowering the blade quickly, lowering the blade slowly, tilting the blade to the left quickly, tilting the blade to the left slowly, tilting the blade to the right quickly, tilting the blade to the right slowly, and holding the blade unchanged. Dividing each action into a fast action and a slow action can further improve the control precision of the blade. In the embodiment of the present invention, the behavior set may be $A = \{a_1, a_2, a_3, a_4, a_5\}$, whose elements respectively represent raising the blade, lowering the blade, tilting the blade to the right, tilting the blade to the left, and holding the blade unchanged.
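One possible in-memory representation of the state set and the behavior set is sketched below. The type names, the helper `make_state_set`, and the idea of discretising the blade by a small list of heights are illustrative assumptions, not part of the described method.

```python
from enum import Enum
from itertools import product
from typing import List, NamedTuple, Sequence


class BladeState(NamedTuple):
    """Six-element blade state: left state point, then right state point."""
    x_l: float
    y_l: float
    z_l: float
    x_r: float
    y_r: float
    z_r: float


class BladeAction(Enum):
    RAISE = 1       # a_1
    LOWER = 2       # a_2
    TILT_RIGHT = 3  # a_3
    TILT_LEFT = 4   # a_4
    HOLD = 5        # a_5


def make_state_set(heights: Sequence[float], x_l: float, y_l: float,
                   x_r: float, y_r: float) -> List[BladeState]:
    """Hypothetical helper: one state per (left height, right height) pair."""
    return [BladeState(x_l, y_l, z_l, x_r, y_r, z_r)
            for z_l, z_r in product(heights, repeat=2)]


# Three height levels per side give 3 * 3 = 9 blade states.
states = make_state_set([0.0, 0.1, 0.2], x_l=0.0, y_l=0.0, x_r=3.0, y_r=0.0)
```

A finer list of heights gives more states and smaller steps between adjacent states, which matches the precision remark above.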

The blade control model comprises a plurality of control strategies, each control strategy corresponds to an observation action, and the observation actions belong to the behavior set A. The plurality of control strategies $\pi$ are fixed once predetermined. Illustratively, the plurality of control strategies may include $\pi_1$ = adjusting the raising and lowering of the blade first and then the tilt of the blade, $\pi_2$ = adjusting the tilt of the blade first and then the raising and lowering of the blade, and so on.

The state transition function T represents the rule by which the blade state changes and is fixed once determined. It gives the probability that any blade state in the state set S (i.e., the initial blade state) becomes a subsequent blade state $s'$ after any action in the behavior set A, where $s'$ belongs to the state set S. Taking blade state $s_1$ and action $a_1$ as an example, the probability of obtaining blade state $s'$ from $s_1$ through action $a_1$ can be expressed as $T(s' \mid s_1, a_1)$. The state transition function T may be determined by at least one of: published literature, domain experts, personal experience, historical data obtained, and stochastic simulations.

For example, referring to Table 1, Table 1 shows how the subsequent blade state $s'$ is determined after the initial blade state $s$ undergoes each blade action. For example, after the initial blade state $s$ undergoes the blade-lowering action, the blade state in the state set S that satisfies $z'_l < z_l$ with $z'_l$ maximal and $z'_r < z_r$ with $z'_r$ maximal is determined as the subsequent blade state $s'$. That is, the blade states in S whose left and right state points have vertical-axis coordinates smaller than those of the initial blade state are determined first, and then, among the determined blade states, the one whose left and right state points have the largest vertical-axis coordinates is determined as the subsequent blade state $s'$.

TABLE 1

When the subsequent blade state $s'$ is determined according to Table 1, any initial blade state yields exactly one subsequent blade state after any given blade action, and any subsequent blade state can be reached from exactly one initial blade state by that action. That is, for a given blade action, initial blade states and subsequent blade states are in one-to-one correspondence: the probability of reaching the corresponding subsequent blade state is 1, and the probability of reaching any other blade state is 0.
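The lowering rule and the resulting 0/1 transition probabilities can be sketched as below. The tie-breaking order (left height first, then right height) and the handling of the lowest state are assumptions made only so that the example is complete.

```python
from typing import List, Optional, Tuple

# (x_l, y_l, z_l, x_r, y_r, z_r)
State = Tuple[float, float, float, float, float, float]


def successor_after_lowering(s: State, states: List[State]) -> Optional[State]:
    """Among states strictly lower on both sides, pick the highest one."""
    lower = [c for c in states if c[2] < s[2] and c[5] < s[5]]
    if not lower:
        return None  # assumption: no successor when s is already the lowest state
    return max(lower, key=lambda c: (c[2], c[5]))


def transition_prob_lowering(s_next: State, s: State, states: List[State]) -> float:
    """T(s' | s, lower): 1 for the single successor, 0 for every other state."""
    return 1.0 if successor_after_lowering(s, states) == s_next else 0.0


# Hypothetical state set with three level blade heights.
levels = [(0.0, 0.0, z, 3.0, 0.0, z) for z in (0.0, 0.1, 0.2)]
nxt = successor_after_lowering(levels[2], levels)  # the state at height 0.1
```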

Illustratively, referring to Table 2, Table 2 shows the probability values of obtaining each blade state after each blade state in the state set S (including $s_1$ to $s_5$) undergoes action $a_1$. For example, the probability of obtaining blade state $s_4$ after blade state $s_3$ undergoes action $a_1$ is 1, and the probabilities of obtaining the other blade states are all 0.

TABLE 2

The belief state B represents an estimate of the blade state over the state set S. It changes over time and includes a plurality of probability values in one-to-one correspondence with the blade states in the state set S. The belief state may be $B = \{p_1, p_2, \ldots, p_n\}$; it has the same number of elements as the state set S, and the elements correspond one to one. $b(s_i) = p_i$ denotes the probability that the blade is in state $s_i$, and the probability values sum to 1: $\sum_{i=1}^{n} p_i = 1$.

It should be noted that, before the curved surface information and the target coordinate information at the current moment (i.e., time t = 0) are obtained, the blade has not yet started construction. At this time the belief state may be determined by a technician, and the belief state is subsequently updated each time the blade control model obtains curved surface information and target coordinate information. Before construction starts, the blade rests horizontally on the ground, and the technician can determine the belief state B at time t = 0 according to this resting state.

The reward function R represents how close a blade state is to the target curved surface. It is expressed by a fixed calculation formula and can be determined based on the state set S and the curved surface information. For any blade state, the sum of the absolute values of the differences between the two state points of the blade state and the corresponding target state points on the target curved surface is calculated, and the negative of this sum is taken as the reward function R. That is, the smaller the difference between a blade state and the target curved surface, the larger the reward function. The target state point corresponding to a state point is the point on the target curved surface that has the same horizontal-axis coordinate value (x value) and longitudinal-axis coordinate value (y value) as the state point. For example, if a blade state is $s = (x_l, y_l, z_l, x_r, y_r, z_r)$, the corresponding target state points on the target curved surface are $(x_l, y_l, z_l^{goal}, x_r, y_r, z_r^{goal})$.

Illustratively, the calculation formula of the reward function R is as follows:

$$R(s) = -\left( \left| z_l - z_l^{goal} \right| + \left| z_r - z_r^{goal} \right| \right)$$

wherein $R(s)$ represents the reward function, $z_l$ represents the vertical-axis coordinate value (z value) of the left state point in any blade state, $z_l^{goal}$ represents the vertical-axis coordinate value of the target state point on the target curved surface corresponding to the left state point, $z_r$ represents the vertical-axis coordinate value of the right state point in the blade state, and $z_r^{goal}$ represents the vertical-axis coordinate value of the target state point on the target curved surface corresponding to the right state point.
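The reward described above (the negative of the summed absolute vertical differences between the two state points and the target curved surface) can be sketched as follows. The callable `surface` standing for the surface equation z = f(x, y), and the flat example surface, are assumptions for illustration.

```python
from typing import Callable, Tuple

# (x_l, y_l, z_l, x_r, y_r, z_r)
State = Tuple[float, float, float, float, float, float]


def reward(state: State, surface: Callable[[float, float], float]) -> float:
    """Negative sum of |z - z_goal| over the left and right state points."""
    x_l, y_l, z_l, x_r, y_r, z_r = state
    z_l_goal = surface(x_l, y_l)  # target height under the left state point
    z_r_goal = surface(x_r, y_r)  # target height under the right state point
    return -(abs(z_l - z_l_goal) + abs(z_r - z_r_goal))


def flat_surface(x: float, y: float) -> float:
    """Hypothetical target surface: a horizontal plane at height 0.5."""
    return 0.5


# Left point 0.2 above the plane, right point 0.1 below: reward is about -0.3.
r = reward((0.0, 0.0, 0.7, 3.0, 0.0, 0.4), flat_surface)
```

A blade state lying exactly on the target surface gets the maximum reward of 0.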

Since the true state of the blade in the POMDP model is uncertain, the reward function R can be expressed in terms of the belief state B. The reward function expressed in terms of the belief state is:

$$R(b) = \sum_{s \in S} b(s)\, R(s)$$

wherein $R(b)$ represents the reward function expressed in terms of the belief state, $S$ represents the state set, $b(s)$ represents the belief state, and $R(s)$ represents the reward function calculated by the above formula.
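The belief-weighted reward is a plain expectation over the state set, which can be sketched as below. The per-state rewards in the example are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def belief_reward(belief: Sequence[float], rewards: Sequence[float]) -> float:
    """R(b) = sum over states of b(s) * R(s)."""
    if len(belief) != len(rewards):
        raise ValueError("belief and rewards must cover the same state set")
    return sum(p * r for p, r in zip(belief, rewards))


belief = [0.0, 0.4, 0.2, 0.4, 0.0]           # b(s_1) .. b(s_5)
state_rewards = [-0.4, -0.2, 0.0, -0.2, -0.4]  # hypothetical R(s_1) .. R(s_5)

# 0.4 * (-0.2) + 0.2 * 0.0 + 0.4 * (-0.2) is about -0.16
expected = belief_reward(belief, state_rewards)
```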

The observation set O includes a plurality of pieces of coordinate information of the mark points and is fixed once determined. Illustratively, the observation set may be $O = \{o_1, o_2, \ldots, o_n\}$, which includes n pieces of coordinate information, or may be $O = \{o_1 \sim o_2, o_2 \sim o_3, \ldots, o_{n-1} \sim o_n\}$, which includes n-1 coordinate information ranges. When the mark points include a left mark point and a right mark point, each piece of coordinate information may be a six-element array, in which the first three numbers represent the coordinates of the left mark point and the last three numbers represent the coordinates of the right mark point.

The observation function Ω represents the relationship between blade states and observations and is fixed once determined. It gives the probability that any piece of coordinate information in the observation set is observed in the subsequent blade state $s'$ obtained after any action. Taking action $a_1$ and coordinate information $o_1$ as an example, the probability of observing $o_1$ in the blade state $s'$ obtained after action $a_1$ can be expressed as $P(o_1 \mid s', a_1)$, $\Omega(s', a_1, o_1)$ or $\Omega(s', o_1, a_1)$. The observation function Ω may be determined by at least one of: published literature, domain experts, personal experience, historical data obtained, and stochastic simulations.

For example, referring to Tables 3 and 4, the observation function is described by taking Tables 3 and 4 as examples. Table 3 shows, for each blade state in the state set S (including $s_1$ to $s_5$) obtained through action $a_1$, the probability values of observing each piece of coordinate information in the observation set O (including $o_1$ to $o_5$). Table 4 shows, for each blade state in the state set S (including $s_1$ to $s_5$) obtained through action $a_1$, the probability values of observing coordinate information within each coordinate information range, where each piece of coordinate information in the observation set O belongs to one of the coordinate information ranges.

As shown in Table 3, in the blade state $s_1$ obtained through action $a_1$, the probability values of observing $o_1$ and $o_2$ are 0.9 and 0.1, respectively, and the probability values of observing $o_3$ to $o_5$ are all 0. As shown in Table 4, in the blade state $s_2$ obtained through action $a_1$, the probability values of observing coordinate information within the ranges $o_1 \sim o_2$, $o_2 \sim o_3$ and $o_3 \sim o_4$ are 0.1, 0.8 and 0.1, respectively, and the probability values of observing coordinate information within the ranges $o_4 \sim o_5$ and $o_5 \sim o_6$ are both 0.

TABLE 3

TABLE 4

Referring to fig. 2, fig. 2 is a schematic flow chart of another blade control method according to an embodiment of the present invention, where the method includes the following steps:

Step 201, obtaining the curved surface information of the target curved surface at the current moment.

For example, the equation for the target surface may include: z = f (x, y). The curved surface information may be determined by a technician based on the job site's job objective, the mode of operation of the bulldozer blade, and the actual conditions at the job site.

Step 202, determining the focal length and the baseline of the camera device installed on the blade.

The computer device may determine the focal length (f) and the baseline (BL) from image information of the camera device, and the image information may include a plurality of images captured by the camera device. For example, a plurality of (e.g., 10 to 20) checkerboard pictures may be captured by the camera device, and the camera device may then be calibrated from these pictures using a calibration tool (e.g., MATLAB or OpenCV) to obtain the focal length and the baseline of the camera device.

Step 203, calculating the camera coordinates of the mark point at the current moment in the coordinate system of the camera device based on the focal length and the baseline.

The mark points may include at least one point on the left side of the blade and at least one point on the right side of the blade. The coordinate system of the camera device is a coordinate system established with the camera device as the origin. Optionally, the camera coordinates may be calculated using the parallax principle in combination with the focal length and the baseline. The computer device may calculate the camera coordinates based on the focal length, the baseline, and the coordinates of the mark point in the cameras of the camera device. In the embodiment of the present invention, the camera device may be a binocular camera, and accordingly the coordinates of the mark point in the cameras of the camera device may include the coordinates of the mark point on the left camera and the coordinates of the mark point on the right camera. For example, the coordinates of the mark point in the cameras of the camera device may include $\{x_l, y_l, z_l\}$ and $\{x_r, y_r, z_r\}$.

Referring to Fig. 3, Fig. 3 is a schematic diagram of coordinates of a mark point in a camera of a camera device according to an embodiment of the present invention. Taking a binocular camera as the camera device, Fig. 3 shows the left camera a1, the right camera a2, the baseline a3, the image plane a4 of the left camera a1 and the right camera a2, the focal length f1 of the left camera a1, the focal length f2 of the right camera a2, the y-axis coordinate value $y_l$ of the mark point a5 on the left camera a1, and the y-axis coordinate value $y_r$ of the mark point on the right camera a2. It should be noted that only the x-axis and the y-axis are shown in Fig. 3; the z-axis is not shown and is perpendicular to the plane of the x-axis and the y-axis.

For example, the camera coordinates may be calculated by a formula based on the parallax principle, wherein x, y and z respectively represent the coordinate values of the camera coordinates on the x-axis, the y-axis and the z-axis, f represents the focal length, BL represents the baseline, $y_l$ and $y_r$ respectively represent the y-axis coordinate values of the mark point on the left camera and on the right camera, and $z_l$ represents the z-axis coordinate value of the mark point on the left camera.
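A sketch of the standard parallax (triangulation) relations is given below, using the symbols defined above. It rests on assumptions that the text does not confirm: a rectified binocular camera, the origin at the left camera, the x-axis as the depth direction, the y-axis along the baseline, and image coordinates expressed in the same unit as the focal length. The function name and example numbers are hypothetical.

```python
from typing import Tuple


def camera_coordinates(f: float, bl: float, y_l: float, y_r: float,
                       z_l: float) -> Tuple[float, float, float]:
    """Triangulate a mark point from its left/right image coordinates."""
    disparity = y_l - y_r
    if disparity == 0:
        raise ValueError("zero disparity: the point is at infinity")
    x = f * bl / disparity  # depth along the optical axis
    y = y_l * x / f         # offset along the baseline direction
    z = z_l * x / f         # offset perpendicular to the x-y plane
    return x, y, z


# f = 800 (pixels), BL = 0.12 m, disparity = 40 - 20 = 20 pixels.
# Depth is about 800 * 0.12 / 20 = 4.8 m.
point = camera_coordinates(f=800.0, bl=0.12, y_l=40.0, y_r=20.0, z_l=10.0)
```

A larger disparity means a closer point, so depth resolution degrades with distance for a fixed baseline.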

Step 204, determining the coordinates of the mark point in the world coordinate system based on the position of the camera device, the IMU data, the position of the positioning device and the camera coordinates.

The camera coordinates may be transformed based on the position of the camera device, the IMU data and the position of the positioning device, so as to convert the coordinates of the mark point from the camera coordinate system to the world coordinate system. The computer device obtains the Euler angles between the camera coordinate system and the world coordinate system based on the IMU data, and then converts the camera coordinates based on the Euler angles, the position of the camera device and the position of the positioning device.

For example, the coordinates of the marked point in the world coordinate system may be determined using the following formula:

$$(x_w, y_w, z_w) = R_z R_y R_x T_{c \to w} (x_c, y_c, z_c) + (x_{RTK}, y_{RTK}, z_{RTK})$$

wherein $(x_w, y_w, z_w)$ represents the coordinates of the mark point in the world coordinate system, $R_z R_y R_x$ represents the rotation of the coordinate system according to the Euler angles, $T_{c \to w}$ represents a translation matrix determined according to the relative position of the camera device and the positioning device, $(x_c, y_c, z_c)$ represents the camera coordinates, and $(x_{RTK}, y_{RTK}, z_{RTK})$ represents the coordinates of the positioning device in the world coordinate system.
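The conversion can be sketched in plain Python as below. Two points are assumptions rather than statements of the described method: the Euler angles are taken as rotations about the x, y and z axes in that order of application, and the translation matrix is modelled as adding a fixed offset vector `t_cw` (camera relative to positioning device) before the rotation. All names and numbers are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def rot_x(a: float) -> Matrix:
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(a: float) -> Matrix:
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(a: float) -> Matrix:
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a: Matrix, v: Sequence[float]) -> List[float]:
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def camera_to_world(p_c: Sequence[float], euler: Sequence[float],
                    t_cw: Sequence[float], p_rtk: Sequence[float]) -> List[float]:
    """p_w = R_z R_y R_x (p_c + t_cw) + p_rtk, with euler = (about x, about y, about z)."""
    ax, ay, az = euler
    rotation = matmul(rot_z(az), matmul(rot_y(ay), rot_x(ax)))
    shifted = [p + t for p, t in zip(p_c, t_cw)]  # assumed form of T_{c->w}
    rotated = matvec(rotation, shifted)
    return [r + q for r, q in zip(rotated, p_rtk)]


# A 90-degree rotation about z maps the camera x-axis onto the world y-axis.
p_w = camera_to_world([1.0, 0.0, 0.0], (0.0, 0.0, math.pi / 2),
                      [0.0, 0.0, 0.0], [10.0, 20.0, 30.0])
```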

Referring to Fig. 4, Fig. 4 is a schematic diagram of a camera coordinate conversion process according to an embodiment of the present invention. Fig. 4 shows the camera device coordinate system b1, the mark point b2 and the world coordinate system b3. The three coordinate axes of the camera device coordinate system b1 are $x_c$, $y_c$ and $z_c$, and the three coordinate axes of the world coordinate system b3 are $x_w$, $y_w$ and $z_w$.

Step 205, determining the coordinates of the mark point in the world coordinate system as the target coordinate information.

Step 206, determining a value function corresponding to each control strategy based on the blade control model, the curved surface information and the target coordinate information, wherein the blade control model comprises a plurality of control strategies.

Referring to Fig. 5, Fig. 5 is a schematic flow chart of a method for determining a value function corresponding to each control strategy according to an embodiment of the present invention. Fig. 5 is described by taking the foregoing POMDP model as an example, and the method may include the following steps:

Step 2061, updating the belief state by using the state set, the observation action corresponding to any one of the plurality of control strategies, the target coordinate information, the belief state, the observation function and the state transition function.

Illustratively, the belief state may be updated using the state set, the observation action corresponding to any one of the control strategies, the target coordinate information, the belief state, the observation function, the state transition function, and a first formula. The first formula includes:

$$b'(s') = \frac{\Omega(s', a, o_t) \sum_{s \in S} T(s' \mid s, a)\, b(s)}{\sum_{s' \in S} \Omega(s', a, o_t) \sum_{s \in S} T(s' \mid s, a)\, b(s)}$$

wherein $S$ represents the state set, $s'$ and $s$ both represent blade states in the state set, $a$ represents the observation action corresponding to the control strategy, $o_t$ represents the target coordinate information, $b'(s')$ represents the updated belief state, $\Omega(s', a, o_t)$ represents the observation function, $T(s' \mid s, a)$ represents the state transition function, and $b(s)$ represents the belief state before updating.

Note that, in determining Ω(s', a, o_t), when the target coordinate information exists among the plurality of pieces of coordinate information in the observation set, the probability of observing o_t in the subsequent blade state s' obtained after action a is directly determined as Ω(s', a, o_t). When the target coordinate information does not exist among the plurality of pieces of coordinate information in the observation set, the coordinate information o_t' having the smallest difference from the target coordinate information may be determined first, and the probability of observing o_t' in the subsequent blade state s' obtained after action a is then determined as Ω(s', a, o_t).

Taking the calculation of b'(s₃) by using the foregoing Tables 2 and 3 as an example, where o_t is o₂ and the belief state B is [0, 0.4, 0.2, 0.4, 0]: Ω(s₃, a, o₂) = 0.1;

b'(s₃) = (0.1 × 0.4)/(1 × 0.4) = 0.1.
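The belief update of step 2061 can be sketched as follows. This is a minimal sketch, assuming the standard Bayesian POMDP belief update, in which the prior belief is propagated through T(s' | a, s), weighted by Ω(s', a, o_t) and then normalized; whether this matches the first formula term by term is an assumption. The function names, the use of Euclidean distance for "smallest difference", and the three-state sample numbers are hypothetical and do not reproduce the Tables 2 and 3 referred to above.

```python
import math


def nearest_observation(target, observation_set):
    """Pick the coordinate in the observation set closest to the target
    (Euclidean distance is an assumption)."""
    return min(observation_set, key=lambda o: math.dist(o, target))


def update_belief(belief, transition, observation_prob):
    """belief[s]            : b(s), prior probability of blade state s
    transition[s][s2]    : T(s2 | a, s) for the fixed observation action a
    observation_prob[s2] : Omega(s2, a, o_t) for the fixed a and observed o_t
    """
    n = len(belief)
    # Predicted probability of each successor state before seeing o_t.
    predicted = [sum(transition[s][s2] * belief[s] for s in range(n)) for s2 in range(n)]
    # Weight by the observation likelihood, then normalize.
    unnormalized = [observation_prob[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


# Hypothetical three-state example.
prior = [0.5, 0.5, 0.0]
T = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
omega = [0.1, 0.8, 0.1]
posterior = update_belief(prior, T, omega)
closest = nearest_observation((0.0, 0.0, 1.1), [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)])
```

The normalization keeps the updated belief a probability distribution over the blade states, which the later steps rely on when summing probabilities per action.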

Step 2062, updating the observation action corresponding to any control strategy by using the curved surface information, the state set, the belief state and any control strategy, wherein the updated observation action corresponding to any control strategy belongs to the behavior set.

Each blade state comprises the three-dimensional coordinates of two state points, which are respectively located on the two sides of the blade, and the curved surface information comprises the three-dimensional coordinates of a plurality of points. First, based on the curved surface information and the state set, the difference on the vertical axis between each of the two state points in each blade state and the corresponding point on the target curved surface is determined, to obtain the difference corresponding to each blade state. Next, the blade action corresponding to each blade state is determined based on the difference corresponding to that blade state, where the blade action corresponding to any blade state belongs to the action set. Then, based on the belief state, the probability value of each type of blade action among the blade actions corresponding to the blade states is determined. Finally, the updated observation action corresponding to any control strategy is determined based on that control strategy and the probability values of the types of blade actions. Any control strategy may indicate an arrangement order of the types of blade actions; among the types of blade actions having the maximum probability value, the one arranged foremost in that control strategy is determined as the updated observation action corresponding to that control strategy.

As an example, assume that the state set is S = {s₁, s₂, s₃, s₄, s₅}, the belief state is [0, 0.4, 0.2, 0.4, 0], and the control strategy is to adjust the lift of the blade first and then adjust the tilt of the blade. Let z'_l denote the vertical-axis value of the point on the target curved surface corresponding to the left state point, z'_r the vertical-axis value of the point on the target curved surface corresponding to the right state point, z_l the vertical-axis value of the left state point, and z_r the vertical-axis value of the right state point. After the difference corresponding to each blade state is determined, it can be determined from these differences that z'_l > z_l and z'_r > z_r in s₁ and s₂, that z'_l > z_l and z'_r < z_r in s₃, and that z'_l < z_l and z'_r > z_r in s₄ and s₅. It is then determined that the blade action corresponding to s₁ and s₂ is lifting, the blade action corresponding to s₃ is a right tilt, and the blade action corresponding to s₄ and s₅ is a left tilt. The sum of the probabilities corresponding to s₁ and s₂ in the belief state B, 0.4, is determined as the probability value of lifting; the probability corresponding to s₃, 0.2, is determined as the probability value of right tilt; and the sum of the probabilities corresponding to s₄ and s₅, 0.4, is determined as the probability value of left tilt. The probability values of lifting and left tilt are equal, and lifting is arranged foremost according to the control strategy, so the updated observation action corresponding to the control strategy is determined to be lifting.
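The selection in this example can be sketched as follows. This is a minimal sketch that starts from the blade action already assigned to each state, sums the belief probabilities per action type, and breaks ties using the order given by the control strategy. The function name, the action labels `lift`, `tilt_left` and `tilt_right`, and the relative order of the two tilt actions within each priority list are hypothetical.

```python
import math


def select_observation_action(state_actions, belief, priority):
    """state_actions[i]: blade action assigned to state i
    belief[i]        : probability of state i in the belief state
    priority         : action types in the order given by the control strategy
    """
    # Sum the belief probabilities of all states that share an action type.
    probs = {}
    for action, p in zip(state_actions, belief):
        probs[action] = probs.get(action, 0.0) + p
    best = max(probs.values())
    # Among the most probable action types, return the one ranked first.
    for action in priority:
        if action in probs and math.isclose(probs[action], best):
            return action
    raise ValueError("priority order does not cover the most probable action")


# Values from the example: s1 and s2 lift, s3 tilts right, s4 and s5 tilt left.
state_actions = ["lift", "lift", "tilt_right", "tilt_left", "tilt_left"]
belief = [0.0, 0.4, 0.2, 0.4, 0.0]
lift_first = select_observation_action(state_actions, belief, ["lift", "tilt_left", "tilt_right"])
tilt_first = select_observation_action(state_actions, belief, ["tilt_left", "tilt_right", "lift"])
```

Here `lift_first` is `"lift"`, as in the example, while a strategy that ranks tilting ahead of lifting resolves the same tie to `"tilt_left"`.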

It should be noted that, before the curved surface information and the target coordinate information at the current time are obtained (i.e., at time t = 0), the blade has not yet started construction. At this time, the observation action corresponding to any control strategy may be set by a technician. Subsequently, each time the blade control model obtains the curved surface information and the target coordinate information, the observation action corresponding to any control strategy is updated.

Step 2063, determining a value function corresponding to any control strategy based on the state set, the observation action corresponding to any control strategy, the belief state, the reward function, the state transfer function and the observation function.

The value function refers to the expected gain obtained when any blade state in the state set S performs an action according to any control strategy, and it indicates the deviation of the actual blade state from the current curved surface. Since the true state of the blade in the POMDP model is uncertain, the value function can be expressed in terms of the belief state B.

For example, the value function corresponding to any control strategy can be determined based on a state set, an observation action corresponding to any control strategy, a belief state, a reward function, a state transition function, an observation function, and a second formula. The second formula includes:

wherein S represents the state set, O represents the observation set, s' and s both represent blade states in the state set, a represents the observation action corresponding to any control strategy, o represents coordinate information in the observation set, b(s) represents the belief state, R(s) represents the reward function, γ represents a discount factor, T(s' | a, s) represents the state transition function, Ω(s', a, o) represents the observation function, and V(s') represents the value function. Here γ ∈ (0, 1) represents a time discount applied to the reward function R; it keeps the expectation bounded, thereby accelerating convergence of the value function and improving the processing efficiency of the blade control model. It should be noted that, in this formula, the belief state and the observation action corresponding to any control strategy are both the updated parameters.

When the value function is calculated according to the second formula, the calculation is iterative. The iteration stops once the calculated value meets the precision requirement, and the value obtained at that point is taken as the value function corresponding to the control strategy.
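The iterative calculation can be sketched as follows. This is a minimal sketch, assuming a fixed-action Bellman-style recursion built from the symbols listed above: each state value is R(s) plus γ times the expectation of V(s') over T(s' | a, s) and Ω(s', a, o), and the result is weighted by the belief state b(s). The exact form of the second formula may differ. The function name, the default tolerance and the two-state sample numbers are hypothetical.

```python
def evaluate_strategy(belief, reward, transition, observation,
                      gamma=0.9, tol=1e-6, max_iter=10_000):
    """belief[s]          : b(s), the updated belief state
    reward[s]          : R(s)
    transition[s][s2]  : T(s2 | a, s) for the strategy's observation action a
    observation[s2][o] : Omega(s2, a, o) for each coordinate o in the observation set
    """
    n = len(reward)
    values = [0.0] * n
    for _ in range(max_iter):
        new_values = []
        for s in range(n):
            expected_next = sum(
                transition[s][s2] * sum(observation[s2]) * values[s2]
                for s2 in range(n)
            )
            new_values.append(reward[s] + gamma * expected_next)
        # Largest change in any state value during this sweep.
        delta = max(abs(nv - v) for nv, v in zip(new_values, values))
        values = new_values
        if delta < tol:  # precision requirement met, stop iterating
            break
    # Weight the per-state values by the belief state.
    return sum(b * v for b, v in zip(belief, values))


# Hypothetical two-state example in which each state stays where it is.
value = evaluate_strategy(
    belief=[0.5, 0.5],
    reward=[1.0, 0.0],
    transition=[[1.0, 0.0], [0.0, 1.0]],
    observation=[[0.5, 0.5], [1.0, 0.0]],
    gamma=0.5,
)
```

Because γ is strictly below 1, each sweep shrinks the remaining error by at least the factor γ, which is why the iteration terminates for any finite precision requirement.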

The foregoing steps 2061 to 2063 are described by taking the determination of the value function corresponding to any control strategy as an example. The foregoing steps 2061 to 2063 are required to be performed for each control strategy to obtain the value function corresponding to each control strategy.

Step 207, controlling the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies.
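The selection in step 207 amounts to taking the control strategy whose value function is largest. A minimal sketch, assuming the value of each control strategy has already been computed and stored in a dictionary; the function name, the strategy names and the numbers are hypothetical.

```python
def choose_control_strategy(value_by_strategy):
    """Return the name of the control strategy with the largest value function."""
    if not value_by_strategy:
        raise ValueError("at least one control strategy is required")
    return max(value_by_strategy, key=value_by_strategy.get)


# Hypothetical values for two control strategies.
values = {"lift_then_tilt": 1.8, "tilt_then_lift": 1.2}
chosen = choose_control_strategy(values)
```

If two strategies share the maximum value, `max` returns the first one encountered, so the insertion order of the dictionary acts as the tie-breaker in this sketch.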

In the embodiment of the invention, the coordinate information of the mark point and the curved surface information of the target curved surface are changed in real time, and the computer equipment can acquire the coordinate information of the mark point and the curved surface information of the target curved surface in real time and execute the embodiment to realize the real-time control of the scraper knife.

In summary, the blade control method provided in the embodiment of the present invention obtains the curved surface information of the target curved surface at the current time and the target coordinate information of the mark point on the blade, determines the value function corresponding to each control strategy based on the blade control model and the obtained curved surface information and target coordinate information, and then controls the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies. When the control strategy corresponding to the maximum value function is used to control the movement of the blade, the state of the blade can be controlled accurately, so that the movement track of the blade conforms to the target curved surface. This improves the control precision of the blade, the efficiency of its construction operation and its practical engineering value.

Further, the blade control model may include a Partially Observable Markov Decision Process (POMDP) model, and the process of obtaining the target coordinate information includes: determining the focal length and the baseline of a camera device arranged on the blade, calculating the camera coordinates of the mark point at the current moment in the camera device coordinate system based on the focal length and the baseline, and determining the coordinates of the mark point in the world coordinate system based on the position of the camera device, the IMU data, the position of the positioning device and the camera coordinates, to obtain the target coordinate information of the mark point. Therefore, on the basis of determining the state of the blade in real time by using machine vision, a blade control process based on a partially observable Markov decision process is established by using reinforcement learning. The state of the blade can be predicted in real time and the optimal control strategy of the blade can be determined, so that anticipatory control and closed-loop control of the blade are realized and the control process of the blade is more stable. In addition, the embodiment of the invention has a low application cost and is therefore widely applicable.

The sequence of the method provided by the above embodiment may be appropriately adjusted, and the steps may also be increased or decreased according to the situation, for example, step 201 and step 202 may be executed simultaneously. The embodiment of the present invention is not limited thereto.

Alternatively, the above embodiment is described by taking an example in which the blade control device executes the blade control method, and the blade control device may be located in the bulldozer (for example, in a vehicle-mounted controller of the bulldozer) or may be located in a computer device independent of the bulldozer. When located in a computer device separate from the bulldozer, the computer device needs to send the determined control strategy to a control device (e.g., an onboard controller) of the bulldozer, which controls the movement of the blade based on the received control strategy. In one example, different steps in the blade control method may be performed by different modules. The different modules may be located in one device or in different devices. The embodiment of the present invention does not limit the device for executing the blade control method.

The embodiment of the invention provides a blade control model which comprises a plurality of control strategies. The blade control model is used for processing the input curved surface information and target coordinate information, determining a value function corresponding to each control strategy, and outputting the control strategy corresponding to the maximum value function. The control strategy corresponding to the maximum value function is used for controlling the movement of the blade. Optionally, the blade control model may include a POMDP model, whose specific structure may refer to the foregoing description and is not repeated herein.

An embodiment of the present invention provides a blade control device, and fig. 6 is a block diagram of a blade control device according to an embodiment of the present invention, where the blade control device 30 includes:

a first obtaining module 301, configured to obtain curved surface information of a target curved surface at the current moment;

a second obtaining module 302, configured to obtain target coordinate information of a mark point on the blade at the current moment;

a determining module 303, configured to determine a value function corresponding to each control strategy based on a blade control model, the curved surface information and the target coordinate information, where the blade control model includes a plurality of control strategies; and

a control module 304, configured to control the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies.

In summary, in the blade control apparatus provided in the embodiment of the present invention, the first obtaining module obtains the curved surface information of the target curved surface at the current time, the second obtaining module obtains the target coordinate information of the mark point on the blade, the determining module determines the value function corresponding to each control strategy based on the blade control model and the obtained curved surface information and target coordinate information, and the control module controls the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies. When the control strategy corresponding to the maximum value function is used to control the movement of the blade, the state of the blade can be controlled accurately, so that the movement track of the blade conforms to the target curved surface and the control precision of the blade is improved.

Optionally, the blade is provided with a camera device and a positioning device, and the second obtaining module 302 is configured to:

determine the focal length and the baseline of the camera device;

calculate, based on the focal length and the baseline, the camera coordinates of the mark point at the current moment in the camera device coordinate system;

determine the coordinates of the mark point in the world coordinate system based on the position of the camera device, the IMU data, the position of the positioning device and the camera coordinates; and

determine the coordinates of the mark point in the world coordinate system as the target coordinate information.

Optionally, each control strategy corresponds to an observation action, and the blade control model further includes: state set, behavior set, state transfer function, belief state, reward function, observation set and observation function; the state set comprises a plurality of scraper knife states, the action set comprises a plurality of scraper knife actions, the observation set comprises a plurality of coordinate information of the marking points, and the belief state comprises a plurality of probability values corresponding to the plurality of scraper knife states one to one.

Fig. 7 is a block diagram of a determining module according to an embodiment of the present invention, where the determining module 303 includes:

a first updating submodule 3031, configured to update the belief state by using the state set, the observation action corresponding to any one of the plurality of control strategies, the target coordinate information, the belief state, the observation function and the state transition function;

a second updating submodule 3032, configured to update the observation action corresponding to the control strategy by using the curved surface information, the state set, the belief state and the control strategy, where the updated observation action corresponding to the control strategy belongs to the behavior set; and

a determining submodule 3033, configured to determine the value function corresponding to the control strategy based on the state set, the observation action corresponding to the control strategy, the belief state, the reward function, the state transition function and the observation function.

Optionally, the first update submodule 3031 is configured to:

updating the belief state by using the state set, the observation action corresponding to any control strategy, the target coordinate information, the belief state, the observation function, the state transfer function and the first formula;

the first formula includes:

wherein S represents the state set, s' and s both represent blade states in the state set, a represents the observation action corresponding to any control strategy, o_t represents the target coordinate information, b'(s') represents the updated belief state, Ω(s', a, o_t) represents the observation function, T(s' | a, s) represents the state transition function, and b(s) represents the belief state before updating.

Optionally, each blade state includes three-dimensional coordinates of two state points, the two state points are located on two sides of the blade, and the curved surface information includes three-dimensional coordinates of a plurality of points.

The second updating submodule 3032 is configured to:

determine, based on the curved surface information and the state set, the difference on the vertical axis between each of the two state points in each blade state and the corresponding point on the target curved surface, to obtain the difference corresponding to each blade state;

determine the blade action corresponding to each blade state based on the difference corresponding to that blade state, where the blade action corresponding to any blade state belongs to the action set;

determine, based on the belief state, the probability value of each type of blade action among the blade actions corresponding to the blade states; and

determine the updated observation action corresponding to any control strategy based on that control strategy and the probability values of the types of blade actions.

Optionally, the determining submodule 3033 is configured to:

and determining a value function corresponding to any control strategy based on the state set, the observation action corresponding to any control strategy, the belief state, the reward function, the state transfer function, the observation function and the second formula.

The second formula includes:

wherein S represents the state set, O represents the observation set, s' and s both represent blade states in the state set, a represents the observation action corresponding to any control strategy, o represents coordinate information in the observation set, b(s) represents the belief state, R(s) represents the reward function, γ represents a discount factor, T(s' | a, s) represents the state transition function, Ω(s', a, o) represents the observation function, and V(s') represents the value function.

In summary, in the blade control apparatus provided in the embodiment of the present invention, the first obtaining module obtains the curved surface information of the target curved surface at the current time, the second obtaining module obtains the target coordinate information of the mark point on the blade, the determining module determines the value function corresponding to each control strategy based on the blade control model and the obtained curved surface information and target coordinate information, and the control module controls the movement of the blade by using the control strategy corresponding to the maximum value function among the plurality of control strategies. When the control strategy corresponding to the maximum value function is used to control the movement of the blade, the state of the blade can be controlled accurately, so that the movement track of the blade conforms to the target curved surface. This improves the control precision of the blade, the efficiency of its construction operation and its practical engineering value.

In addition, the blade control model may include a Partially Observable Markov Decision Process (POMDP) model, and the process by which the second obtaining module obtains the target coordinate information includes: determining the focal length and the baseline of a camera device arranged on the blade, calculating the camera coordinates of the mark point at the current moment in the camera device coordinate system based on the focal length and the baseline, and determining the coordinates of the mark point in the world coordinate system based on the position of the camera device, the IMU data, the position of the positioning device and the camera coordinates, to obtain the target coordinate information of the mark point. Therefore, on the basis of determining the state of the blade in real time by using machine vision, a blade control process based on a partially observable Markov decision process is established by using reinforcement learning. The state of the blade can be predicted in real time and the optimal control strategy of the blade can be determined, so that anticipatory control and closed-loop control of the blade are realized and the control process of the blade is more stable. In addition, the embodiment of the invention has a low application cost and is therefore widely applicable.

The blade control device provided by the embodiment of the invention can execute the flow of the blade control method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

The embodiment of the invention provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the scraper knife control method provided by the embodiment of the invention is realized when the processor executes the program.

Fig. 8 is a schematic structural diagram of a computer apparatus according to an embodiment of the present invention, as shown in fig. 8, the computer apparatus includes a processor 40, a memory 41, an input device 42, and an output device 43; the number of processors 40 in the computer device may be one or more, and one processor 40 is taken as an example in fig. 8; the processor 40, the memory 41, the input device 42 and the output device 43 in the computer apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 8.

The memory 41, as a computer-readable storage medium, may be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the blade control method in the embodiment of the present invention (e.g., the first obtaining module 301, the second obtaining module 302, the determining module 303, and the control module 304 in the blade control device 30). The processor 40 executes various functional applications of the computer device and blade control by running software programs, instructions, and modules stored in the memory 41, that is, implements any of the blade control methods described above.

The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 41 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 41 may further include memory located remotely from processor 40, which may be connected to a computer device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 42 may be used to receive input numeric or character information (e.g., curved surface information and target coordinate information) and to generate key signal inputs relating to user settings and function control of the computer device. The output device 43 may include a display device such as a display screen.

Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement any one of the blade control methods provided in the embodiments of the present invention.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly can be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It should be noted that, in the embodiment of the blade control device, the included units and modules are only divided according to the functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A method of blade control, the method comprising:

wherein the blade control model comprises a partially observable Markov decision process model, and the value function is an expected profit obtained when a blade state in a state set executes an action according to a control strategy;

controlling the movement of the scraper knife by utilizing a control strategy corresponding to a maximum value function in the multiple control strategies;

each control strategy corresponds to an observation action, and the scraper knife control model further comprises: state set, behavior set, state transfer function, belief state, reward function, observation set and observation function; the state set comprises a plurality of scraper knife states, the action set comprises a plurality of scraper knife actions, the observation set comprises a plurality of coordinate information of the marking points, and the belief state comprises a plurality of probability values corresponding to the plurality of scraper knife states one to one.

2. The method of claim 1, wherein the blade has an imaging device and a positioning device mounted thereon; the obtaining of the target coordinate information of the mark point on the scraper knife at the current moment includes:

determining a focal length and a baseline of the camera device;

calculating the camera coordinates of the mark point at the current moment in a camera device coordinate system based on the focal length and the baseline;

determining coordinates of the mark point in a world coordinate system based on the position of the camera device, Inertial Measurement Unit (IMU) data, the position of the positioning device and the camera coordinates;

and determining the coordinates of the marking points in the world coordinate system as the target coordinate information.

3. The method of claim 1, wherein determining the value function for each control strategy based on the blade control model, the surface information, and the target coordinate information comprises:

updating the belief state by using the state set, the observation action corresponding to any one of the plurality of control strategies, the target coordinate information, the belief state, the observation function and the state transfer function;

updating the observation action corresponding to any control strategy by using the curved surface information, the state set, the belief state and any control strategy, wherein the updated observation action corresponding to any control strategy belongs to the behavior set;

and determining a value function corresponding to any control strategy based on the state set, the observation action corresponding to any control strategy, the belief state, the reward function, the state transfer function and the observation function.

4. The method of claim 3, wherein said updating the belief state using the state set, the observed action corresponding to any of the plurality of control strategies, the target coordinate information, the belief state, the observation function, and the state transfer function comprises:

updating the belief state by using the state set, the observation action corresponding to any one of the control strategies, the target coordinate information, the belief state, the observation function, the state transfer function and a first formula;

the first formula includes:

wherein S represents the state set, s' and s both represent blade states in the state set, a represents the observation action corresponding to any one of the control strategies, o_t represents the target coordinate information, b'(s') represents the updated belief state, Ω(s', a, o_t) represents the observation function, T(s' | a, s) represents the state transition function, and b(s) represents the belief state before updating.

5. The method of claim 3, wherein each of the blade states includes three-dimensional coordinates of two state points located on either side of the blade, and the curved surface information includes three-dimensional coordinates of a plurality of points;

the updating the observation action corresponding to any control strategy by using the curved surface information, the state set, the belief state and any control strategy comprises the following steps:

determining, based on the curved surface information and the state set, the difference along the vertical axis between the two state points in each blade state and the corresponding points on the target curved surface, to obtain the difference corresponding to each blade state;

determining a blade action corresponding to each blade state based on the difference corresponding to each blade state, wherein the blade action corresponding to any blade state belongs to the action set;

determining, based on the belief state, probability values of the various types of blade actions among the blade actions corresponding to the blade states;

and determining the updated observation action corresponding to any control strategy based on that control strategy and the probability values of the various types of blade actions.
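The first three steps above can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the action names `raise`, `lower` and `hold`, the `tolerance` threshold, the averaging of the two vertical differences, and the `surface_height` lookup are all hypothetical. The last step, combining the probabilities with a control strategy, is not specified in enough detail here to sketch.

```python
from collections import defaultdict
from typing import Callable, Dict, Sequence, Tuple

Point = Tuple[float, float, float]   # (x, y, z)
BladeState = Tuple[Point, Point]     # two state points, one on each side of the blade


def action_for_state(
    blade_state: BladeState,
    surface_height: Callable[[float, float], float],
    tolerance: float = 0.01,
) -> str:
    """Map one blade state to a blade action from its vertical offset.

    surface_height(x, y) is an assumed lookup that returns the z value of the
    target curved surface at (x, y).
    """
    # Vertical-axis difference between each state point and the target surface.
    diffs = [z - surface_height(x, y) for (x, y, z) in blade_state]
    mean_diff = sum(diffs) / len(diffs)  # assumption: average the two sides
    if mean_diff > tolerance:
        return "lower"   # blade sits above the target surface
    if mean_diff < -tolerance:
        return "raise"   # blade sits below the target surface
    return "hold"


def action_probabilities(
    states: Sequence[BladeState],
    belief: Sequence[float],
    surface_height: Callable[[float, float], float],
    tolerance: float = 0.01,
) -> Dict[str, float]:
    """Sum the belief probability of every state that maps to the same action."""
    probs: Dict[str, float] = defaultdict(float)
    for state, p in zip(states, belief):
        probs[action_for_state(state, surface_height, tolerance)] += p
    return dict(probs)
```

Because each belief probability belongs to exactly one blade state, the probabilities returned per action type sum to the same total as the belief state itself.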

6. The method of claim 3, wherein the determining of a value function corresponding to any control strategy based on the state set, the observation set, the observation action corresponding to any control strategy, the belief state, the reward function, the state transition function, and the observation function comprises:

determining a value function corresponding to any control strategy based on the state set, the observation action corresponding to any control strategy, the belief state, the reward function, the state transition function, the observation function and a second formula;

the second formula includes:

wherein S represents the state set, O represents the observation set, s' and s both represent blade states in the state set, a represents the observation action corresponding to any control strategy, o represents coordinate information in the observation set, b(s) represents the belief state, R(s) represents the reward function, γ represents a discount factor, T(s' | a, s) represents the state transition function, Ω(s', a, o) represents the observation function, and V(s') represents the value function.
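The second formula is likewise not reproduced in this text. A minimal sketch, assuming it takes the usual form of an expected immediate reward plus a discounted expected future value built from the symbols listed above, is given below. The function name `strategy_value`, the callable signatures and the default discount factor are assumptions for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable

State = Hashable


def strategy_value(
    belief: Dict[State, float],
    action: Hashable,
    observations: Iterable[Hashable],
    reward: Callable[[State], float],
    transition: Callable[[State, Hashable, State], float],
    observe: Callable[[State, Hashable, Hashable], float],
    next_value: Callable[[State], float],
    gamma: float = 0.95,
) -> float:
    """Assumed form of the value of one control strategy:

    sum_s b(s) R(s)
      + gamma * sum_o sum_s' Omega(s', a, o) * [sum_s T(s' | a, s) b(s)] * V(s')
    """
    states = list(belief)
    # Expected immediate reward under the current belief.
    immediate = sum(belief[s] * reward(s) for s in states)

    future = 0.0
    for o in observations:
        for s_next in states:
            # Probability of reaching s' from the current belief under action a.
            reach = sum(transition(s_next, action, s) * belief[s] for s in states)
            future += observe(s_next, action, o) * reach * next_value(s_next)

    return immediate + gamma * future
```

Evaluating this once per control strategy, with that strategy's observation action passed as `action`, yields one value per strategy that can then be compared.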

7. A blade control model, characterized in that the blade control model comprises a plurality of control strategies;

the control strategy corresponding to the maximum value function is used for controlling the movement of the scraper knife;

8. A blade control apparatus, the apparatus comprising:

the scraper knife control model comprises a partially observable Markov decision process model, and the value function is the expected return obtained when a scraper knife state in the state set executes actions according to a control strategy;

the control module is used for controlling the movement of the scraper knife by utilizing the control strategy corresponding to the maximum value function among the plurality of control strategies;

each control strategy corresponds to an observation action, and the scraper knife control model further comprises: a state set, a behavior set, a state transition function, a belief state, a reward function, an observation set and an observation function; the state set comprises a plurality of scraper knife states, the behavior set comprises a plurality of scraper knife actions, the observation set comprises a plurality of pieces of coordinate information of the mark points, and the belief state comprises a plurality of probability values in one-to-one correspondence with the plurality of scraper knife states.
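The selection performed by the control module can be sketched as a plain maximum over the per-strategy values. The function name `select_strategy` and the strategy labels in the usage line are hypothetical, and how ties are resolved is an assumption.

```python
from typing import Dict, Hashable


def select_strategy(values: Dict[Hashable, float]) -> Hashable:
    """Return the control strategy whose value function is largest.

    values maps each control strategy to its computed value. On a tie, the
    strategy that appears first in the dictionary is returned (assumption).
    """
    if not values:
        raise ValueError("no control strategies to choose from")
    return max(values, key=values.get)


# Hypothetical usage with three labeled strategies.
chosen = select_strategy({"raise": 0.31, "hold": 0.10, "lower": -0.20})
```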

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the blade control method according to any of claims 1-6 when executing the program.

10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the blade control method according to any one of claims 1-6.