CN111882101A

CN111882101A - Control method based on supply chain system consistency problem under switching topology

Info

Publication number: CN111882101A
Application number: CN202010446158.7A
Authority: CN
Inventors: 李庆奎; 弓镇宇; 易军凯
Original assignee: Beijing Information Science and Technology University
Current assignee: Beijing Information Science and Technology University
Priority date: 2020-05-25
Filing date: 2020-05-25
Publication date: 2020-11-03

Abstract

The invention relates to a control method for the consistency problem of a supply chain system under switching topology, comprising the following steps. S1: the supply chain system is modeled as a multi-agent system, and the product matching problem under uncertain market demand is formulated as an H∞ consistency problem. S2: the discrete-time H∞ consistency problem is studied under the switching topology using game theory, with the process of suppressing the bullwhip effect regarded as a game. S3: a double-loop strategy iteration algorithm is given to solve the decoupled HJI equation. S4: a simulation example is given to demonstrate the effectiveness of the method. The invention uses distributed optimal control and zero-sum game theory to study the H∞ consistency problem of a multi-agent supply chain system with switching topology. The production rate and the market demand are regarded as the players of the game, and the optimal production rate against the worst-case disturbance is obtained by solving the HJI equation according to zero-sum game theory.

Description

Control method based on supply chain system consistency problem under switching topology

Technical Field

The invention relates to the technical field of supply chain systems, in particular to a control method based on a supply chain system consistency problem under a switching topology.

Background

A supply chain system is a network system that integrates a series of devices and business entities and includes a plurality of sub-supply chains. From a practical perspective, the new generation of Internet of Things technology equips the devices in a supply chain system with computing capabilities, so such a system can be regarded as a multi-agent system. In recent years, with the development of information technology, multi-agent based supply chain systems have received more and more attention because of their wide application in research directions such as power systems, intelligent manufacturing systems, and intelligent transportation systems. In an enterprise, raw material processing, intelligent factories, logistics and customers are integrated into one cyber-physical system, and the control mode changes from centralized control to distributed control, which improves production quality and efficiency. From a theoretical point of view, a multi-agent based supply chain system has self-organization, scalability, and coordination. An agent-based supply chain system is therefore more robust to uncertain market demand, more reliable when information flow or logistics is interrupted or production faults occur, and, owing to its coordination design, highly flexible when node enterprises are added or removed. However, no scientific and rigorous control method for the consistency problem of the supply chain system exists, so a control method for the supply chain system consistency problem under switching topology is provided.

Disclosure of Invention

The invention aims to solve the defects in the prior art, and provides a control method based on the consistency problem of a supply chain system under a switching topology.

In order to achieve the purpose, the invention provides the following technical scheme:

A control method for the supply chain system consistency problem under switching topology comprises the following specific steps:

S1: first, the supply chain system is modeled as a multi-agent system, and the product matching problem under uncertain market demand is formulated as an H∞ consistency problem;

S2: second, the discrete-time H∞ consistency problem is studied under the switching topology using game theory. The process of suppressing the bullwhip effect can be regarded as a game in which the production rate and the market demand are the players. Solving the game requires the solution of a coupled HJI equation. Because of the average-consistency property, and because isolated points may appear, singular terms can arise in the solution, so a decoupling method is designed to obtain the solution of the HJI equation indirectly, and the optimal control strategy is proved to be a Nash equilibrium under certain specific conditions;

S3: next, a double-loop strategy iteration algorithm is given to solve the decoupled HJI equation;

S4: finally, a simulation example is given to demonstrate the effectiveness of the proposed method.

Preferably, step S1 requires preliminary knowledge and problem modeling: the interaction between agents is first modeled through graph theory, and the multi-agent based supply chain system is then modeled.

Preferably, in step S2, for the distributed zero-sum game problem of the supply chain system, the H∞ problem of the supply chain system is formulated as a two-player zero-sum game. A decoupling method is designed to simplify the calculation of the global HJI equation, and a proof is given that the system achieves H∞ consistency at the Nash equilibrium under specific conditions. The decoupled HJI equation is solved using a strategy iteration algorithm, and an optimal production strategy for uncertain demand and switching topology is obtained.

Preferably, in step S3, strategy iteration is applied to the decoupled HJI equation to further calculate the optimal production rate: the Lyapunov-like HJI equation is solved using the strategy iteration algorithm, which includes an inner loop and an outer loop. The inner loop performs strategy evaluation, in which the production control and the consistency protocol are held fixed and the market demand gain is updated until convergence. The outer loop performs the strategy update, in which the production control and the consistency protocol are updated.

Compared with the prior art, the invention has the following beneficial effects. The invention uses distributed optimal control and zero-sum game theory to study an H∞ consistency problem for a multi-agent supply chain system with switching topology, in which the production rate and the market demand are regarded as the players of the game. According to zero-sum game theory, the optimal production rate against the worst-case disturbance is obtained by solving an HJI equation, and a decoupling method and a strategy iteration algorithm are further given as the specific solution steps. Finally, a simulation example shows that the optimal production strategy designed according to the proposed method has good control performance and suppresses the bullwhip effect.

Drawings

FIG. 1 is a schematic view of the production process of the ith subchain containing n devices according to the present invention;

FIG. 2 is a schematic diagram of the topology between agents and the internal structure of an ith agent in accordance with the present invention;

FIG. 3 is a schematic diagram of a switching signal according to the present invention;

FIG. 4 is a schematic diagram of the variation of the inventory level x_i1 under the switching topology zero-sum game method of the present invention;

FIG. 5 is a schematic diagram of the variation of the inventory level x_i2 under the switching topology zero-sum game method of the present invention;

FIG. 6 is a schematic diagram of the variation of the error inventory level (index i1) under the switching topology zero-sum game method of the present invention;

FIG. 7 is a schematic diagram of the variation of the error inventory level (index i2) under the switching topology zero-sum game method of the present invention;

FIG. 8 is a schematic diagram of the variation of the inventory level x_i1 under the switching topology H∞ control method of the present invention;

FIG. 9 is a schematic diagram of the variation of the inventory level x_i2 under the switching topology H∞ control method of the present invention;

FIG. 10 is a schematic diagram of the variation of the error inventory level (index i1) under the switching topology H∞ control method of the present invention;

FIG. 11 is a schematic diagram of the variation of the error inventory level (index i2) under the switching topology H∞ control method of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1-11, the present invention provides a technical solution:

Preliminary knowledge and problem modeling:

A. Graph theory

The interactions between agents are modeled with a graph G = {V, E, A}, where V = {1, …, N} is a finite, nonempty node set in which each agent is viewed as a node, and E ⊆ V × V is the edge set: if node i can send information to node j, then (i, j) ∈ E. A = [a_ij] is the adjacency matrix of the graph; for the ith and jth nodes, a_ij = 1 when (i, j) ∈ E, and a_ij = 0 otherwise. The neighbor set of agent i is N_i = {j | j ∈ V, (i, j) ∈ E}. An isolated point is a node that has no neighbors and is also not a neighbor of any other node. If there is at least one node in the graph that can reach every other node through a directed path, the graph is said to contain a directed spanning tree. Let D = diag(d_i) denote the in-degree matrix of the graph, where

d_i = Σ_{j ∈ N_i} a_ij

denotes the in-degree of node i. The Laplacian matrix of the graph is defined as L = D − A, and its elements l_ij are:

l_ij = d_i if i = j, and l_ij = −a_ij if i ≠ j.

The Laplacian matrix of a symmetric graph is a symmetric matrix. The graphs considered herein are symmetric and contain no self-loops.
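The graph quantities defined above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `laplacian` and the three-node path graph are assumptions made for the example and do not come from the figures of the invention.

```python
import numpy as np

def laplacian(adjacency):
    """Return L = D - A for a symmetric 0/1 adjacency matrix without self-loops."""
    A = np.asarray(adjacency, dtype=float)
    if not np.allclose(A, A.T):
        raise ValueError("adjacency matrix must be symmetric")
    if np.any(np.diag(A) != 0):
        raise ValueError("self-loops are not allowed")
    D = np.diag(A.sum(axis=1))  # degree matrix, d_i = sum_j a_ij
    return D - A

# Hypothetical path graph 1 - 2 - 3 (an assumption for illustration only).
A_example = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
L_example = laplacian(A_example)
print(L_example)
```

Each row of the resulting matrix sums to zero, which is the property used later when the Laplacian matrix is described as singular.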

We consider the case in which exactly one isolated point is present at a given time, and it can later restore its connection with the other sub-chains. The isolated point is not fixed but changes over time, so the topology is time-varying. Let the switching signal be σ: [0, ∞) → {0, 1, …, N}. σ(k) = h means that the h-th agent is an isolated point, and no isolated point is present when σ(k) = 0. The time-varying topology is denoted G_σ(k), and the union graph of {G_σ(k), k ∈ [0, ∞)} is G_u. A change of isolated point changes the neighbor sets of some sub-chains and the Laplacian matrix of the graph, and thereby affects the interaction among the sub-chains. The time-varying neighbor set and the Laplacian matrix both depend on h;

the Laplacian matrix is L_h, with L_0 = L. The elements of the Laplacian matrix when h ≠ 0 are:

For further discussion, the following assumption is first introduced:

Assumption 1: the union graph of {G_σ(k), k ∈ [0, ∞)} contains a spanning tree.
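The switching topology and Assumption 1 can be illustrated with a minimal sketch. The helper names `isolate` and `union_is_connected`, the complete three-node base graph, and the sample switching sequence are all assumptions made for the example. Isolating node h is modeled by removing every edge incident to it, and for a symmetric graph the spanning-tree condition is checked through the second-smallest Laplacian eigenvalue.

```python
import numpy as np

def isolate(adjacency, h):
    """Adjacency matrix for switching signal sigma(k) = h (1-based); h = 0 keeps all edges."""
    A = np.array(adjacency, dtype=float)
    if h != 0:
        A[h - 1, :] = 0.0  # node h sends to nobody
        A[:, h - 1] = 0.0  # nobody sends to node h
    return A

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

def union_is_connected(adjacencies):
    """A symmetric union graph contains a spanning tree iff its Laplacian
    has exactly one zero eigenvalue."""
    union = (sum(adjacencies) > 0).astype(float)
    eig = np.sort(np.linalg.eigvalsh(laplacian(union)))
    return bool(eig[1] > 1e-9)

# Hypothetical base topology: three fully connected sub-chains.
A_base = np.ones((3, 3)) - np.eye(3)
signal = [0, 2, 2, 0, 3, 0]  # hypothetical switching sequence
graphs = [isolate(A_base, h) for h in signal]

L_2 = laplacian(isolate(A_base, 2))  # Laplacian while agent 2 is isolated
print(L_2)
print(union_is_connected(graphs))
```

In this sketch the row and column of the isolated agent in L_h are zero, so a single topology with an isolated node is never connected by itself; only the union over time is required to contain a spanning tree.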

B. modeling based on a multi-agent supply chain system:

Assume that a supply chain system based on multi-agent modeling includes N sub-chains, each of which can be regarded as an agent and as a node in the topology graph. We first discuss the internal operation of the ith sub-chain, which contains n devices, as shown in FIG. 1,

where the production rate, market demand and desired inventory level of the ith sub-chain may be expressed as:

From a control point of view, x_ij(k) and u_ij(k) can be regarded as the state variable and the control input of the jth device in the ith sub-chain, respectively. Each sub-chain can be regarded as a cascade system, in which the control input u_ij(k) is both the input of the jth device and the demand placed on the preceding device. In real production, the production rate u_ij(k) is nonnegative and bounded, that is, 0 ≤ u_ij(k) ≤ u_m, where u_m is the maximum production rate. The last device is directly connected to the market and is directly affected by customer demand. To measure and handle the market demand, the following assumptions are needed.

Assumption 2: the demand consists of a known constant demand and an unknown time-varying demand, so the market demand can be expressed as:

where the uncertain demand ω_in(k) ∈ l₂[0, ∞) can be regarded as an external disturbance.

Assumption 3: in the supply chain system, a reasonable production plan can meet the market demand.

In practice, the stock capacity has an upper bound, and the products stored in a warehouse deteriorate for various reasons. Let s_ij and ρ_j denote the upper bound and the deterioration rate of the jth warehouse of the ith sub-chain, respectively. The following assumption can therefore be given.

Assumption 4: the stock level satisfies 0 ≤ x_ij(k) ≤ s_ij, and the product deteriorates at the rate ρ_j.

Based on Assumptions 2-4, the ith sub-chain can be modeled as:

x_i1(k+1) = (1 − ρ₁) x_i1(k) + u_i1(k) − u_i2(k)

x_i2(k+1) = (1 − ρ₂) x_i2(k) + u_i2(k) − u_i3(k)

⋮

x_in(k+1) = (1 − ρ_n) x_in(k) + u_in(k) − d_in(k)

Therefore, the model of the ith sub-chain can be written as

x_i(k+1) = (I − ρ) x_i(k) + B u_i(k) − d_i(k) (4)

In the formula:

ρ＝diag{ρ₁，ρ₂，...，ρ_n}
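As a minimal sketch of one step of model (4), the following Python function assumes that B has ones on the diagonal and −1 on the superdiagonal and that d_i(k) is zero except for the market demand in its last entry. Both forms are inferred from the scalar equations above rather than quoted from the invention. The deterioration rates, production rates, demand and bound u_max are hypothetical values; only the initial state [6, 7] is taken from the simulation example.

```python
import numpy as np

def subchain_step(x, u, d_last, rho, u_max):
    """One step of x(k+1) = (I - rho) x(k) + B u(k) - d(k) for a cascade of n devices."""
    x = np.asarray(x, dtype=float)
    n = x.size
    u = np.clip(np.asarray(u, dtype=float), 0.0, u_max)  # 0 <= u_ij(k) <= u_m
    B = np.eye(n) - np.eye(n, k=1)  # u_j feeds device j and drains device j-1 (assumed form)
    d = np.zeros(n)
    d[-1] = d_last                  # market demand acts on the last device only
    return (np.eye(n) - np.diag(rho)) @ x + B @ u - d

# Hypothetical numbers, except the initial state taken from the simulation example.
x_next = subchain_step(x=[6.0, 7.0], u=[1.0, 2.0], d_last=1.5,
                       rho=[0.1, 0.2], u_max=5.0)
print(x_next)
```

With these values the first warehouse holds 0.9·6 + 1 − 2 = 4.4 and the second holds 0.8·7 + 2 − 1.5 = 6.1 after one step.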

Integrating the sub-chains gives the global dynamic equation:

A commodity may be assembled from a plurality of component parts, and different sub-chains of a supply chain system may be responsible for processing different parts. Reasonably planning the stock levels of the different products helps to reduce the stock cost while ensuring the quantity of output commodities, which shows that studying consistency, or matching, of the product stock quantities is indispensable. Within one sub-chain, the stock level of each warehouse device is directly or indirectly influenced by the other devices, so a desired storage level needs to be designed for each warehouse device according to the market demand. The desired storage level of the jth device of the ith sub-chain is denoted c_ij, and this variable clearly lies within the allowable storage range of the warehouse. The product quantity matching problem therefore requires the state x_ij(k) to reach its desired value. Before further discussion, the following definition is introduced.

Definition 1: for a multi-agent system containing N agents, achieving consistency means that, for any initial state, the states of the individual agents in the system satisfy:

the storage level reaching the desired value is actually a solution to the state tracking problem, which can be seen as an extension of the consistency problem.

To achieve consistency in the supply chain system, the production rate of each sub-chain should be adjusted appropriately according to the inventory levels of its neighbors, which means that a reasonable consistency protocol needs to be designed. We assume that the production rate consists of two parts: one part is responsible for production control, and the other part is responsible for the consistency protocol. Since the topology changes with the switching signal σ(k) = h, the controller can be designed as:

where the two gain matrices respectively represent the production control gain and the consistency gain, and

c_i＝[c_i1,c_i2,…,c_in]^T

we can derive the global control law as:

wherein

Note 1: the consistency protocol is distributed and topology dependent, and the state information received from other agents is incomplete when isolated nodes are present.

Denote the average state as

The error vector is given by

from which we can derive:

Then, the global error dynamics of the supply chain system can be written as:

wherein:

Note 3: the known constant demand can be directly cancelled by the production control, so only the effect of the unknown demand on the supply chain system needs to be considered. The time-varying demand causes the bullwhip effect, whose influence can be suppressed by solving the H∞ consistency problem.

Note 4: the Laplacian matrix of a symmetric graph is also symmetric, so combining this with the properties of the Laplacian matrix gives: L_h M = M L_h = L_h.
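The identity in Note 4 can be checked numerically. The definition of M is not reproduced in this text, so the sketch below assumes the usual averaging matrix M = I − (1/N)·1·1^T, and it uses a hypothetical three-agent Laplacian in which agent 2 is isolated. Under that assumption the identity follows because the rows and columns of a symmetric Laplacian sum to zero.

```python
import numpy as np

N = 3
# Assumed form of M: removes the average of a vector (not quoted from the source).
M = np.eye(N) - np.ones((N, N)) / N

# Hypothetical symmetric Laplacian with agent 2 isolated (only edge 1-3 remains).
L_h = np.array([[ 1.0, 0.0, -1.0],
                [ 0.0, 0.0,  0.0],
                [-1.0, 0.0,  1.0]])

left = L_h @ M   # L_h M = L_h - (1/N) (L_h 1) 1^T, and L_h 1 = 0
right = M @ L_h  # M L_h = L_h - (1/N) 1 (1^T L_h), and 1^T L_h = 0
print(np.allclose(left, L_h), np.allclose(right, L_h))
```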

Let

The error system can therefore be rewritten as follows:

and, together with the output, as follows:

where z(k) is the system output, and

and

can be represented by the following formula:

Solving the H∞ consistency problem requires designing the consistency protocol and the production control so that the error system is asymptotically stable when the disturbance is zero and the following attenuation condition is satisfied.

Definition 2: let γ^* be the lower bound of the disturbance attenuation level. For nonzero

and a bounded function β, the l₂ gain of system (10) is bounded and is less than or equal to a given positive scalar γ, where γ ≥ γ^*.
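The l₂-gain notion in Definition 2 can be illustrated by measuring, along one simulated trajectory, the ratio of output energy to disturbance energy. The sketch below assumes a generic linear error system e(k+1) = A e(k) + D ω(k) with output z(k) = C e(k) and zero initial error; the function name `empirical_l2_gain` and the scalar test system are assumptions and are not the matrices of system (10). Any such ratio is a lower estimate of the true l₂ gain, which equals 2 for the scalar system used here.

```python
import numpy as np

def empirical_l2_gain(A, D, C, disturbance):
    """sqrt(sum ||z_k||^2 / sum ||w_k||^2) along one trajectory with e(0) = 0."""
    e = np.zeros(A.shape[0])
    z_energy = 0.0
    w_energy = 0.0
    for w in disturbance:
        w = np.atleast_1d(np.asarray(w, dtype=float))
        z = C @ e
        z_energy += float(z @ z)
        w_energy += float(w @ w)
        e = A @ e + D @ w
    return np.sqrt(z_energy / w_energy)

# Hypothetical scalar system e(k+1) = 0.5 e(k) + w(k), z(k) = e(k).
A = np.array([[0.5]])
D = np.array([[1.0]])
C = np.array([[1.0]])
gain = empirical_l2_gain(A, D, C, disturbance=np.ones(200))
print(gain)
```

A controller that meets Definition 2 with level γ keeps this ratio at or below γ for every nonzero disturbance in l₂, up to the initial-state term β.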

Distributed zero-sum gaming problems for supply chain systems:

In this section, the H∞ problem of the supply chain system is formulated as a two-player zero-sum game. To simplify the computation of the global HJI equation, a decoupling method is designed, and a proof is given that the system achieves H∞ consistency at the Nash equilibrium under specific conditions. Furthermore, a strategy iteration algorithm is adopted to solve the decoupled HJI equation and obtain an optimal production strategy for uncertain demand and switching topology.

For convenience of writing, x(k) is written as x_k. Under the switching topology, an infinite-horizon performance function is defined for the supply chain system as follows:

where Q'_h ≥ 0,

T > 0 are symmetric matrices of appropriate dimensions, and the switching signal is σ(k) = h. The matrix Q'_h depends on the topology of the graph, that is, on the Laplacian matrix L_h. The corresponding value function can be defined as:

The H∞ consistency problem can be reduced to a zero-sum differential game: the production control and the consistency protocol minimize the performance function, while the uncertain market demand maximizes it. The game can therefore be expressed as:

If the game has a saddle point

then it has a unique solution

which is equivalent to the Nash equilibrium condition:

From the Bellman optimality principle and equation (14), the Bellman equation can be derived:

consider a quadratic form of the value function:

substituting (19) into (18) yields:

the Hamiltonian is then:

From the stationarity conditions:

the optimal production control, the optimal consistency protocol and the worst-case disturbance are obtained as:

where

and the matrix pair

is stabilizable,

is detectable, and for the optimal consistency protocol there exists a matrix G such that:

Note 5: the matrix G provides more freedom in the consistency design. In this context, the Laplacian matrix L_h is singular, and therefore

is a singular gain matrix, whereas when G = 0 the consistency gain matrix in (24) is nonsingular, which clearly contradicts

Therefore, for the sake of the following discussion, a nonzero matrix G is added here.

Moreover, substituting the production control (22), the consistency protocol (23), and the market demand (24) into (20) yields the global HJI equation:

where:

Note 6: the matrix Q'_h changes with the topology, so the solution P'_h of the HJI equation also changes, which makes it possible to design the production rate for the periods in which isolated nodes occur.

The error-state-based feedback control law is shown in (11), so the optimal production rate depends on the switching topology. Assume that

and that the solution of the HJI equation satisfies

Combining with (22) gives:

Similarly, the optimal consistency protocol is:

Suppose that

and that the worst-case market demand has the form:

where

represents the worst-case market demand gain, and therefore

Note 7: by zero-sum game theory, the optimal production rate can suppress the bullwhip effect produced by the worst-case disturbance. If the production rate designed for the supply chain system achieves consistency under the worst-case disturbance, consistency is also achieved under any other market demand.

To decouple the HJI equation, the following lemma is introduced.

Lemma 1: consider the multi-agent based supply chain error dynamic system (10), and assume

G₁ = (L_h − I_N) P₁,

and that Q'_h satisfies:

where

is related to the switching signal. Then solving the HJI equation is equivalent to solving the following equation:

in the formula:

Proof: the value function is related to the matrix

and the weight matrix has the form

Substituting

into (26) and (29), respectively, gives:

Let

Combining with (27) gives:

Thus, the HJI equation (25) is equivalent to:

wherein:

Selecting

gives:

where the matrix is symmetric, thus:

If the weight matrix Q'_h satisfies:

where

the global HJI equation can be decoupled as:

where

and:

Thus, by solving (31), P'_h can further be obtained.

Note 8: the matrix Q_h is related to the switching signal, so different solutions P_h are obtained for different topologies. The topology information is embedded in P_h, from which the optimal state feedback gain under the switching topology is obtained.

Assumption 5: for some matrix S, the weight matrix is Q_h = S^T L_h S, and the matrix pair

is detectable.

B. l₂-gain boundedness and Nash equilibrium

In this subsection, we show that the designed production rate can suppress the bullwhip effect to a given level while achieving consistency of the supply chain system, and that the solution of the HJI equation satisfies the Nash equilibrium condition. The following lemma is first introduced:

Lemma 2: suppose V^* is a positive solution of the HJI equation, then:

Theorem 1: assume that the suppression level of the bullwhip effect satisfies γ > γ^* and that the HJI equation has a smooth positive solution. Then the supply chain system achieves H∞ consistency under the optimal production control and the optimal consistency protocol, where the market demand satisfies

Proof: substituting the solution of the HJI equation into the Hamiltonian (21) yields:

where

Let

We have:

Furthermore, for the market demand

(36) can be written as:

Thus, by the Lyapunov theorem, the equilibrium point of the error system (10) is asymptotically stable. Taking into account the constraint condition (12) and Lemma 2, summing both sides of inequality (36) yields:

where

Letting N → ∞ gives:

Thus, by Definition 2, the supply chain system can suppress the bullwhip effect to a given level and achieve consistency.

Theorem 2: assume that the suppression level of the bullwhip effect satisfies γ > γ^* and that the HJI equation has a smooth positive solution

so that the supply chain system reaches consistency. When G₄ satisfies

then

is a Nash equilibrium and the optimal solution of the game.

Proof: for a smooth value function V, the performance function (13) can be rewritten as:

Suppose V^* is obtained by solving the HJI equation and the optimal strategy is

Then we obtain:

The supply chain error system is asymptotically stable at the equilibrium point, which indicates that the supply chain achieves H∞ consistency.

Thus

and

Clearly

When

the performance function (38) can be written as:

Suppose that

Then:

If G₄ satisfies inequality (37), then

which further satisfies the Nash equilibrium condition (17). At the equilibrium point, one obtains:

where the optimal solution of the game is

Strategy iteration and decoupling HJI equation:

To further calculate the optimal production rate, a strategy iteration algorithm is used to solve the Lyapunov-like HJI equation (31).

The strategy iteration algorithm contains an inner loop and an outer loop. The inner loop performs strategy evaluation: the production control and the consistency protocol are held fixed, and the market demand gain is updated through (33) until convergence. The outer loop performs the strategy update, in which the production control and the consistency protocol are updated through (32) and (34).

Note 9: the decoupled HJI equation is nonlinear and still difficult to solve directly, so a reinforcement learning algorithm is adopted to obtain a locally convergent solution, through which the optimal production control, the optimal consistency protocol and the worst-case market demand can be obtained.
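The double-loop structure described above can be sketched for a generic discrete-time linear-quadratic zero-sum game. This is a simplified stand-in, not the algorithm of the invention: it assumes error dynamics e(k+1) = A e(k) + B u(k) + D ω(k), a single control gain K in place of the separate production control and consistency gains, the stage cost e^T Q e + u^T R u − γ² ω^T ω, and a stable A so that zero initial gains are admissible. The function names and all matrices are hypothetical, and the update formulas are the standard ones for this generic game rather than equations (32)-(34).

```python
import numpy as np

def lyap(Ac, M):
    """Solve the Lyapunov equation P = Ac.T @ P @ Ac + M (Ac must be Schur stable)."""
    n = Ac.shape[0]
    vec = np.linalg.solve(np.eye(n * n) - np.kron(Ac.T, Ac.T), M.reshape(-1))
    return vec.reshape(n, n)

def double_loop_iteration(A, B, D, Q, R, gamma, outer=100, inner=100, tol=1e-10):
    n, m, q = A.shape[0], B.shape[1], D.shape[1]
    K = np.zeros((m, n))  # control gain, u = -K e
    L = np.zeros((q, n))  # disturbance gain, w = L e
    for _ in range(outer):
        # Inner loop: K is fixed, the disturbance gain is updated until convergence.
        for _ in range(inner):
            Ac = A - B @ K + D @ L
            P = lyap(Ac, Q + K.T @ R @ K - gamma**2 * L.T @ L)
            L_new = np.linalg.solve(gamma**2 * np.eye(q) - D.T @ P @ D,
                                    D.T @ P @ (A - B @ K))
            converged = np.max(np.abs(L_new - L)) < tol
            L = L_new
            if converged:
                break
        # Outer loop: the control gain is updated against the current disturbance gain.
        K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ (A + D @ L))
        finished = np.max(np.abs(K_new - K)) < tol
        K = K_new
        if finished:
            break
    return P, K, L

# Hypothetical two-device data; gamma is chosen large enough for the game to be well posed.
A = np.diag([0.9, 0.8])
B = np.array([[1.0, -1.0], [0.0, 1.0]])
D = np.array([[0.0], [-1.0]])
P, K, L = double_loop_iteration(A, B, D, np.eye(2), np.eye(2), gamma=10.0)
Ac = A - B @ K + D @ L
print(np.max(np.abs(np.linalg.eigvals(Ac))))  # closed-loop spectral radius
```

In this sketch the attenuation level γ is a fixed input. Repeating the computation while lowering γ until the iteration no longer converges is one way to search for the smallest achievable level, in the spirit of the search for γ^* described in the simulation example.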

Assume that a region has three sub-supply chains, each including two devices that are responsible for production and sale, respectively. Assume that the ith sub-chain is disconnected from the other sub-chains within a certain period of time, that is, the ith agent becomes an isolated node. FIG. 2 shows a possible topology and the corresponding Laplacian matrix.

Assuming that the goods in the warehouses of the three sub-chains deteriorate at the same rate, the deterioration rate matrix is as follows:

The initial states of the sub-chains are x_1(0) = [6, 7]^T, x_2(0) = [3.5, 2]^T, x_3(0) = [2.5, 3]^T, and the desired state values are c₁ = [2, 2]^T, c₂ = [2.5, 2.5]^T, c₃ = [3, 3]^T. Appropriate weight matrices are selected and substituted into the HJI equation to obtain the optimal production control strategy, the optimal consistency protocol and the worst-case demand. The switching signal is shown in FIG. 3.

FIGS. 4 and 5 show that the state of each sub-chain approaches its desired state value. While an isolated point is present, the stock state fluctuates because the exchange of state information is interrupted, and the state of the supply chain returns to the desired value once the connection is restored. Furthermore, FIGS. 6 and 7 show that the average consistency error of the supply chain approaches zero, so the three sub-supply chains reach the desired stock levels as isolated points appear and disappear. This means that the system achieves consistency under the switching topology and that the production rate suppresses the bullwhip effect caused by uncertain market demand. The supply chain system therefore achieves H∞ consistency, and by continuously reducing the value of γ, the minimum suppression level γ^* = 0.346 is found.

For comparison, a standard H∞ control method is also implemented here, with the simulation results shown in FIGS. 8-11. Similarly, the supply chain system achieves consistency and reduces the bullwhip effect, and the minimum suppression level is γ^* = 0.534. Both methods enable the system to achieve H∞ consistency, but there are still significant differences: for example, as shown in fig. 1, the suppression level of the zero-sum game method is lower than that of the H∞ control method, and drastic changes of the state curves can be observed under H∞ control. The zero-sum game method therefore has a better ability to suppress uncertainty. In addition, the settling time of the dynamic response of the system under the zero-sum game method is shorter than under standard H∞ control.

Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes in the embodiments and/or modifications of the invention can be made, and equivalents and modifications of some features of the invention can be made without departing from the spirit and scope of the invention.

Claims

1. A control method based on supply chain system consistency problem under switching topology is characterized in that: the method comprises the following specific steps:

S2: second, the discrete-time H∞ consistency problem is studied under the switching topology using game theory; the process of suppressing the bullwhip effect can be regarded as a game in which the production rate and the market demand are the players; solving the game requires the solution of a coupled HJI equation; because of the average-consistency property, and because isolated points may appear, singular terms can arise in the solution, so a decoupling method is designed to obtain the solution of the HJI equation indirectly, and the optimal control strategy is proved to be a Nash equilibrium under certain conditions;

2. The control method for the supply chain system consistency problem under switching topology as claimed in claim 1, wherein: step S1 requires preliminary knowledge and problem modeling, in which the interaction between agents is first modeled through graph theory and the multi-agent based supply chain system is then modeled.

3. The control method for the supply chain system consistency problem under switching topology as claimed in claim 1, wherein: in step S2, the H∞ problem of the supply chain system is formulated as a two-player zero-sum game, a decoupling method is designed to simplify the calculation of the global HJI equation, a proof is given that the system achieves H∞ consistency at the Nash equilibrium under the stated conditions, the decoupled HJI equation is solved using a strategy iteration algorithm, and an optimal production strategy for uncertain demand and switching topology is obtained.

4. The control method for the supply chain system consistency problem under switching topology as claimed in claim 1, wherein: in step S3, strategy iteration is applied to the decoupled HJI equation to calculate the optimal production rate, the Lyapunov-like HJI equation being solved using the strategy iteration algorithm; the strategy iteration algorithm comprises an inner loop and an outer loop; the inner loop performs strategy evaluation, in which the production control and the consistency protocol are held fixed and the market demand gain is updated until convergence; the outer loop performs the strategy update, in which the production control and the consistency protocol are updated.