CN117970817B - Nonlinear electromechanical system on-line identification and robust control method and device

Publication number
CN117970817B
CN117970817B CN202410372235.7A CN202410372235A CN117970817B CN 117970817 B CN117970817 B CN 117970817B CN 202410372235 A CN202410372235 A CN 202410372235A CN 117970817 B CN117970817 B CN 117970817B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410372235.7A
Other languages
Chinese (zh)
Other versions
CN117970817A (en
Inventor
张斌
张语琦
杨辉华
赵文义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202410372235.7A
Publication of CN117970817A
Application granted
Publication of CN117970817B

Abstract

The application provides an on-line identification and robust control method and device for a nonlinear electromechanical system, comprising the following steps: taking the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and establishing a system description based on a two-player zero-sum differential game; introducing a neural network into this system description and establishing a parameterized identification form of the uncertain model; identifying the unknown parameters of the uncertain model from online instantaneous data and offline historical data using a concurrent learning technique; and, based on the identified parameters, learning the solution of the differential game through an adaptive iterative algorithm for solving the saddle point, thereby obtaining the optimal robust controller of the system under the worst-case disturbance. The application extends the traditional Pontryagin minimax principle by combining it with parameter identification, which effectively improves control quality.

Description

Nonlinear electromechanical system on-line identification and robust control method and device
Technical Field
The application relates to the field of control theory and artificial intelligence, in particular to a nonlinear electromechanical system on-line identification and robust control method and device.
Background
Differential games address continuous conflict, competition, and cooperation among two or more parties under differential-equation constraints. The two-player zero-sum differential game is a classical differential game in which one party's gain is the other party's loss. It is of considerable research significance in the control field because it provides a general framework for robust control: the controller and the disturbance are regarded as two virtual players, where the controller seeks to minimize a cost function and the disturbance seeks to maximize it. The goal is to construct the optimal robust controller so that the effect of the worst-case disturbance on the system stays within acceptable limits. There are two main methods for solving the two-player zero-sum differential game: dynamic programming and the Pontryagin minimax principle.
However, in engineering practice, as the processes, structures, operating environments, and sensing modes of complex nonlinear electromechanical systems grow more complicated, modeling errors of various kinds are ubiquitous, so that a robust controller based on the differential game cannot achieve good control quality. Using the offline or online data of a complex nonlinear electromechanical system to construct a robust controller while accurately identifying parameters online is an effective technical approach to overcoming modeling errors and improving controller performance.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art to some extent.
Therefore, a first objective of the present application is to provide an online identification and robust control method for a nonlinear electromechanical system, so as to overcome the shortcomings of the prior art.
A second objective of the present application is to provide an online identification and robust control device for a nonlinear electromechanical system.
A third object of the present application is to propose an electronic device.
A fourth object of the present application is to propose a computer readable storage medium.
A fifth object of the application is to propose a computer programme product.
To achieve the above objective, an embodiment of a first aspect of the present application provides an online identification and robust control method for a nonlinear electromechanical system, including:
taking the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and establishing a system description based on a two-player zero-sum differential game;
introducing a neural network into the system description based on the two-player zero-sum differential game, and establishing a parameterized identification form of the uncertain model;
identifying the unknown parameters of the uncertain model from online instantaneous data and offline historical data using a concurrent learning technique;
based on the identified parameters, learning the solution of the differential game through an adaptive iterative algorithm for solving the saddle point, thereby obtaining the optimal robust controller of the system under the worst-case disturbance.
Optionally, establishing the system description based on the two-player zero-sum differential game comprises:
establishing a differential game model that accounts for the control input and the system disturbance, formulated as:
where the quantities in the model are the system state, the control input, the system disturbance, the system drift dynamics function, the two system input gain matrices, the initial time of the differential game, and the initial state of the differential game;
to represent the maximum magnitudes of the control input and the disturbance, two admissible sets are defined, one for the control input and one for the disturbance, as follows:
where the two bounds are the maximum magnitudes of the control input and of the disturbance, respectively;
for the differential game model, the following non-quadratic cost function is adopted:
where the cost is evaluated over a finite time interval, and the state function and the two remaining functions in the integrand are defined as the following non-negative integral functions:
where the variables of integration are integral arguments, and the four weighting matrices are all positive definite, with:
the purpose of the two-player zero-sum differential game is to determine, over all control inputs and disturbances in the two admissible sets, a saddle point such that the following inequality holds:
where the saddle point consists of the robust control input and the worst-case disturbance; according to the Pontryagin minimax principle, the saddle point of the differential game model satisfies the following set of equations:
where the equations involve the costate of the Pontryagin minimax principle and the Hamiltonian, which is defined as:
Optionally, introducing the neural network to establish the parameterized identification form of the uncertain model comprises:
approximating the unknown system drift dynamics function and system input gain matrices as:
where the three weight matrices are the unknown weight matrices of neural networks whose hidden layers have the respective numbers of neurons, the three corresponding functions are activation functions, and the remaining terms are approximation errors;
when the number of neurons approaches infinity, the approximation errors converge to zero, and the differential game model is transformed according to the approximations to obtain:
where the model is written in terms of the augmented weight matrix, the augmented activation function, and the lumped estimation error, and the lumped estimation error converges uniformly to zero.
Optionally, identifying the unknown parameters of the uncertain model from online instantaneous data and offline historical data using the concurrent learning technique comprises:
identifying the unknown augmented weight matrix in a data-driven manner, where a series of time points is selected within the time interval of the cost function;
defining the data points collected at these time points as the history stack dataset, each data point being the set of the system state, control input, system disturbance, and state derivative at the corresponding time point; a correlation matrix is defined from the stacked data, and during data acquisition this matrix must satisfy the following condition:
denoting the i-th estimates of the system state and of the augmented weight matrix, and, using the history stack dataset, designing the adaptive update laws of the two estimates as follows:
where the two gains are positive parameters, and the two error terms are the i-th estimation error of the system state and the i-th estimation error of the augmented weight matrix.
Optionally, learning the solution of the differential game through the adaptive iterative algorithm for solving the saddle point based on the identified parameters, to obtain the optimal robust controller of the system under the worst-case disturbance, comprises:
S1: selecting the initial parameters, specifying a convergence accuracy within the range allowed by the engineering application, obtaining the initial state estimate and the initial augmented weight matrix estimate over the finite time interval, and calculating the initial cost, formulated as:
where the initial control input and the initial system disturbance are taken from the two admissible sets, and the remaining quantities are the initial state estimate, the initial augmented weight matrix estimate, the initial cost, and the initial measured state of the original differential game model;
S2: treating the control input as one player of the differential game and iteratively updating the control input, formulated as:
where:
S3: calculating the cost function after the control input is updated, and judging whether the updated control input reduces the cost function, in accordance with the principle that the control input seeks to minimize the cost function, formulated as:
where the measured state is that of the original differential game model under the updated control input; if the stated condition holds, the corresponding quantities are reassigned, where the two step sizes are positive iteration steps, and the algorithm returns to S2; otherwise it proceeds to S4;
S4: treating the system disturbance as the other player of the differential game and iteratively updating the system disturbance, formulated as:
where:
S5: updating the estimate of the augmented weight matrix, formulated as:
where the measured state is that of the original differential game model under the updated inputs;
S6: calculating the cost function after the system disturbance is updated, and judging whether the updated system disturbance increases the cost function, in accordance with the principle that the system disturbance seeks to maximize the cost function, formulated as:
if the stated condition holds, the corresponding quantities are reassigned, where the two step sizes are positive iteration steps, and the algorithm returns to S4; otherwise it proceeds to S7;
S7: judging whether the convergence of the control input meets the accuracy requirement; if so, proceeding to S8; otherwise, updating the iterates and returning to S2;
S8: judging whether the convergence of the augmented weight matrix estimate meets the accuracy requirement; if so, stopping the iteration to obtain the optimal robust controller of the system under the worst-case disturbance; otherwise, updating the iterates and returning to S2 to continue the iteration.
To achieve the above object, a second aspect of the present application provides an online identification and robust control device for a nonlinear electromechanical system, comprising:
a first establishing module, configured to take the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and to establish a system description based on a two-player zero-sum differential game;
a second establishing module, configured to introduce a neural network into the system description based on the two-player zero-sum differential game and to establish a parameterized identification form of the uncertain model;
an identification module, configured to identify the unknown parameters of the uncertain model from online instantaneous data and offline historical data using a concurrent learning technique;
a solving module, configured to learn, based on the identified parameters, the solution of the differential game through an adaptive iterative algorithm for solving the saddle point, so as to obtain the optimal robust controller of the system under the worst-case disturbance.
To achieve the above object, an embodiment of a third aspect of the present application provides an electronic device, including: a processor, and a memory communicatively coupled to the processor;
The memory stores computer-executable instructions;
The processor executes computer-executable instructions stored by the memory to implement the method of any one of the first aspects above.
To achieve the above object, an embodiment of a fourth aspect of the present application proposes a computer-readable storage medium having stored therein computer-executable instructions for implementing the method according to any of the above first aspects when being executed by a processor.
To achieve the above object, an embodiment of a fifth aspect of the present application proposes a computer program product comprising a computer program which, when executed by a processor, implements a method as described in any of the above first aspects.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
The controller is taken as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and a system description based on a two-player zero-sum differential game is established; a neural network is introduced into this system description, and a parameterized identification form of the uncertain model is established; the unknown parameters of the uncertain model are identified from online instantaneous data and offline historical data using a concurrent learning technique; based on the identified parameters, the solution of the differential game is learned through an adaptive iterative algorithm for solving the saddle point, and the optimal robust controller of the system under the worst-case disturbance is obtained. The traditional Pontryagin minimax principle is thereby extended and combined with parameter identification, so that control quality is effectively improved and the requirements of practical engineering are met.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a nonlinear electromechanical system on-line identification and robust control method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a quadrotor unmanned aerial vehicle shown according to an embodiment of the present application;
FIG. 3 is a flow chart of an adaptive iterative algorithm shown in accordance with an embodiment of the present application;
FIG. 4 is an initial control input and system disturbance map shown in accordance with an embodiment of the present application;
FIG. 5 is an initial state trace diagram shown in accordance with an embodiment of the present application;
FIG. 6 is a graph of input and disturbance cost changes, according to an embodiment of the application;
FIG. 7 is a graph of a change in cost function according to an embodiment of the application;
FIG. 8 is a diagram illustrating system parameter identification errors according to an embodiment of the present application;
FIG. 9 is a diagram illustrating system state and estimated saddle point changes according to an embodiment of the present application;
FIG. 10 is a block diagram of a nonlinear electromechanical system on-line identification and robust control device according to an embodiment of the present application;
Fig. 11 is a block diagram of an electronic device.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.
An on-line identification and robust control method and apparatus for a nonlinear electromechanical system according to an embodiment of the present application are described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a nonlinear electromechanical system on-line identification and robust control method according to an embodiment of the present application, as shown in FIG. 1, the method comprises the following steps:
Step 101: take the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and establish a system description based on a two-player zero-sum differential game.
In the embodiment of the application, a differential game model that accounts for the control input and the system disturbance is first established, expressed as:
where the quantities in the model are the system state, the control input, the system disturbance, the system drift dynamics function, the two system input gain matrices, the initial time of the differential game, and the initial state of the differential game.
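As a minimal sketch of such a model, assume the common control-affine form in which the state derivative is the drift term plus the control input and the disturbance, each weighted by its gain function. The scalar functions `f`, `g`, `k`, the signals `u` and `w`, and the helper `simulate` are hypothetical choices for illustration and are not taken from the application.

```python
import math

def simulate(f, g, k, u, w, x0, t0, tf, dt):
    """Forward-Euler integration of dx/dt = f(x) + g(x)*u(t) + k(x)*w(t).

    Assumed control-affine form; returns a list of (time, state) pairs.
    """
    steps = int(round((tf - t0) / dt))
    x, t = x0, t0
    traj = [(t, x)]
    for _ in range(steps):
        x = x + dt * (f(x) + g(x) * u(t) + k(x) * w(t))
        t = t + dt
        traj.append((t, x))
    return traj

# Hypothetical scalar example: stable drift, constant gains, no inputs.
f = lambda x: -x          # drift dynamics (assumption)
g = lambda x: 1.0         # control input gain (assumption)
k = lambda x: 0.5         # disturbance input gain (assumption)
u = lambda t: 0.0         # control input signal
w = lambda t: 0.0         # disturbance signal

traj = simulate(f, g, k, u, w, x0=1.0, t0=0.0, tf=1.0, dt=0.001)
final_state = traj[-1][1]  # close to exp(-1) for this unforced example
```

With both inputs set to zero the trajectory decays like the exact solution of the drift alone, which gives a quick sanity check before adding a controller or a disturbance.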
To represent the maximum magnitudes of the control input and the disturbance, two admissible sets are defined, one for the control input and one for the disturbance, as follows:
where the two bounds are the maximum magnitudes of the control input and of the disturbance, respectively.
For the differential game model, the application adopts the following non-quadratic cost function:
where the cost is evaluated over a finite time interval, and the state function and the two remaining functions in the integrand are defined as the following non-negative integral functions:
where the variables of integration are integral arguments, and the four weighting matrices are all positive definite, with:
It will be appreciated that the purpose of the two-player zero-sum differential game is to find, over all control inputs and disturbances in the two admissible sets, a saddle point such that the following inequality holds:
where the saddle point consists of the robust control input and the worst-case disturbance. According to the traditional Pontryagin minimax principle, the saddle point of the differential game model satisfies the following set of equations:
where the equations involve the costate of the Pontryagin minimax principle and the Hamiltonian, which is defined as:
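The saddle-point property can be illustrated on a static toy game. The sketch below assumes the usual two-sided inequality, in which the cost at the saddle point is not exceeded by any disturbance paired with the saddle control and is not undercut by any control paired with the saddle disturbance. The cost `J`, the grids, and the helper `is_saddle` are hypothetical and stand in for the application's dynamic cost function.

```python
def is_saddle(J, u_star, w_star, u_grid, w_grid, tol=1e-12):
    """Check J(u*, w) <= J(u*, w*) <= J(u, w*) on finite grids."""
    j_star = J(u_star, w_star)
    # The maximizing player cannot raise the cost by deviating from w*.
    left = all(J(u_star, w) <= j_star + tol for w in w_grid)
    # The minimizing player cannot lower the cost by deviating from u*.
    right = all(j_star <= J(u, w_star) + tol for u in u_grid)
    return left and right

# Hypothetical cost: convex in u, concave in w, saddle point at (0, 0).
J = lambda u, w: u * u + 2.0 * u * w - w * w

# Bounded admissible sets, sampled on [-1, 1].
grid = [i / 10.0 for i in range(-10, 11)]

origin_is_saddle = is_saddle(J, 0.0, 0.0, grid, grid)
shifted_is_saddle = is_saddle(J, 0.5, 0.0, grid, grid)
```

The second call fails the left inequality because, with the control fixed at 0.5, the disturbance can still increase the cost.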
To describe the system description of the two-player zero-sum differential game more clearly and in detail, some embodiments provide an example of online attitude identification and robust control for a quadrotor unmanned aerial vehicle driven by four motors; the quadrotor is shown in FIG. 2.
The attitude dynamics equation of the quadrotor unmanned aerial vehicle is:
where the parameters are the arm length of the unmanned aerial vehicle, its mass, the three-axis inertia components, the three-axis air resistance coefficients, the three-axis control torques, and the three-axis system disturbances. The system disturbances are restricted to a bounded range. The three-axis control torques are generated by the rotation of the four motors, with the specific expression:
where the two coefficients are the thrust coefficient and the drag coefficient. The angular speeds of the four motors are subject to an upper limit.
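For orientation, the sketch below maps four motor speeds to three-axis torques under an assumed plus-configuration airframe, with roll and pitch torques produced by thrust differences across opposite arms and yaw torque produced by the drag imbalance between the two rotation directions. The motor numbering, sign convention, parameter values, and the function name `motor_torques` are assumptions; the application's own expression may differ.

```python
def motor_torques(omega, arm, thrust_coeff, drag_coeff, omega_max):
    """Three-axis torques from four motor angular speeds.

    Assumed plus configuration: motors 0 and 2 lie on the pitch axis pair,
    motors 1 and 3 on the roll axis pair; speeds are clamped to [0, omega_max].
    """
    sq = [min(max(o, 0.0), omega_max) ** 2 for o in omega]
    tau_roll = arm * thrust_coeff * (sq[3] - sq[1])
    tau_pitch = arm * thrust_coeff * (sq[2] - sq[0])
    tau_yaw = drag_coeff * (sq[1] + sq[3] - sq[0] - sq[2])
    return tau_roll, tau_pitch, tau_yaw

# Hypothetical parameter values, chosen only for illustration.
ARM, B, D, OMEGA_MAX = 0.2, 1e-5, 1e-6, 500.0

hover = motor_torques([100.0] * 4, ARM, B, D, OMEGA_MAX)
rolled = motor_torques([100.0, 100.0, 100.0, 110.0], ARM, B, D, OMEGA_MAX)
clamped = motor_torques([600.0, 500.0, 500.0, 500.0], ARM, B, D, OMEGA_MAX)
```

Equal motor speeds produce zero torque on all three axes, and a speed command above the limit is saturated before the torques are computed.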
In addition, under the corresponding definitions, the attitude dynamics equation can be written as:
In one possible embodiment, the values of the system parameters are selected as shown in Table 1:
TABLE 1
With the corresponding definitions, the robust attitude control problem for the quadrotor unmanned aerial vehicle in this embodiment is to find a control input that satisfies the constraint such that, under the stated initial condition, the following holds for all bounded disturbances:
where the two constants are a time constant and a specified performance index.
Then, the embodiment of the present application further defines:
It will be appreciated that, with the cost function so defined, the problem is transformed into solving for the saddle point of the cost function, which consists of the robust control input and the worst-case disturbance.
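A robust performance condition of this kind is commonly stated as a gain bound: starting from a zero initial state, the accumulated performance cost must not exceed the square of the performance index times the accumulated disturbance energy. The sketch below checks such a bound numerically, assuming that form and using simple quadratic energies in place of the non-quadratic functions of the application; the plant, the signals, and the helper names are hypothetical.

```python
def trapezoid(values, dt):
    """Trapezoidal-rule integral of equally spaced samples."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))

def satisfies_gain_bound(output_energy, disturbance_energy, dt, gamma):
    """True if integral(output) <= gamma^2 * integral(disturbance)."""
    lhs = trapezoid(output_energy, dt)
    rhs = gamma ** 2 * trapezoid(disturbance_energy, dt)
    return lhs <= rhs

# Hypothetical plant dx/dt = -x + w with x(0) = 0 and a unit-step disturbance.
dt, steps = 0.001, 5000
x = 0.0
out_sq, dist_sq = [], []
for _ in range(steps + 1):
    out_sq.append(x * x)     # instantaneous performance cost (assumed x^2)
    dist_sq.append(1.0)      # instantaneous disturbance energy (w = 1)
    x += dt * (-x + 1.0)     # forward-Euler step

holds_for_gamma_1 = satisfies_gain_bound(out_sq, dist_sq, dt, 1.0)
holds_for_gamma_half = satisfies_gain_bound(out_sq, dist_sq, dt, 0.5)
```

For this first-order plant the bound holds with a performance index of 1 and fails with 0.5, which shows how a smaller index makes the requirement stricter.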
Step 102: introduce a neural network into the system description based on the two-player zero-sum differential game, and establish a parameterized identification form of the uncertain model.
In the embodiment of the application, the unknown system drift dynamics function and system input gain matrices are approximated as:
where the three weight matrices are the unknown weight matrices of neural networks whose hidden layers have the respective numbers of neurons, the three corresponding functions are activation functions, and the remaining terms are approximation errors.
Further, when the number of neurons approaches infinity, the approximation errors converge to zero, and the differential game model is transformed according to the approximations to obtain:
where the model is written in terms of the augmented weight matrix, the augmented activation function, and the lumped estimation error, and the lumped estimation error converges uniformly to zero.
As a possible implementation, the single-hidden-layer neural network nodes are selected as:
where one matrix is defined as the unknown weight matrix and another as its estimate.
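One way to read the augmented form is that the three weight blocks are stacked into a single weight vector and the three activation vectors, with the input-gain activations multiplied by the control input and the disturbance respectively, are stacked into a single regressor. The sketch below shows this for a scalar state under that assumption; the basis functions `sigma_f`, `sigma_g`, `sigma_k` and all numeric weights are hypothetical.

```python
import math

def sigma_f(x):
    """Hypothetical hidden-layer activations for the drift dynamics."""
    return [x, math.tanh(x)]

def sigma_g(x):
    """Hypothetical activations for the control input gain."""
    return [1.0, x * x]

def sigma_k(x):
    """Hypothetical activations for the disturbance input gain."""
    return [1.0]

def augmented_regressor(x, u, w):
    """Stack the activations so the model is linear in one weight vector."""
    return sigma_f(x) + [s * u for s in sigma_g(x)] + [s * w for s in sigma_k(x)]

def predict_derivative(weights, x, u, w):
    """State derivative as the inner product of weights and regressor."""
    return sum(wi * si for wi, si in zip(weights, augmented_regressor(x, u, w)))

# Hypothetical weight blocks, concatenated into the augmented weight vector.
W_f, W_g, W_k = [-1.0, 0.5], [2.0, 0.0], [0.3]
W_aug = W_f + W_g + W_k

x, u, w = 0.4, 1.5, -2.0
stacked = predict_derivative(W_aug, x, u, w)
# Same quantity computed term by term: drift + gain*control + gain*disturbance.
separate = (-1.0 * x + 0.5 * math.tanh(x)) + (2.0 * 1.0) * u + 0.3 * w
```

Because the stacked model is linear in the weights, a single identification law can update all three blocks at once.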
Step 103: identify the unknown parameters of the uncertain model from online instantaneous data and offline historical data using a concurrent learning technique.
In the embodiment of the application, the unknown augmented weight matrix is identified in a data-driven manner, where a series of time points is selected within the time interval of the cost function.
Then, the data points collected at these time points are defined as the history stack dataset, each data point being the set of the system state, control input, system disturbance, and state derivative at the corresponding time point; a correlation matrix is defined from the stacked data, and during data acquisition this matrix must satisfy the following condition:
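In concurrent learning, the condition on the recorded data is usually a rank condition: the regressor vectors stored in the history stack must span the full parameter space. Assuming that is the condition intended here, the following sketch checks it with plain Gaussian elimination; `matrix_rank` and `history_is_rich` are hypothetical helper names.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix via Gaussian elimination with pivoting."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def history_is_rich(regressors, n_params):
    """Assumed rank condition: stored regressors span all n_params directions."""
    return matrix_rank(regressors) == n_params

# Hypothetical stacks of two-dimensional regressor vectors.
rich_stack = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
poor_stack = [[1.0, 2.0], [2.0, 4.0]]   # second row is a multiple of the first

rich_ok = history_is_rich(rich_stack, 2)
poor_ok = history_is_rich(poor_stack, 2)
```

A stack that fails this check carries no information about at least one parameter direction, so further data points would need to be recorded before identification.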
It can be appreciated that the i-th estimates of the system state and of the augmented weight matrix are denoted separately; then, using the history stack dataset, the adaptive update laws of the two estimates are designed as follows:
where the two gains are positive parameters, and the two error terms are the i-th estimation error of the system state and the i-th estimation error of the augmented weight matrix.
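The idea of combining instantaneous and recorded data can be sketched as a gradient step on the weight estimate that sums the prediction error at the current sample with the prediction errors over every stored data point. The version below is a simplified discrete-time illustration with a scalar state derivative and weight updates only; the gains, the data, and the function `concurrent_update` are assumptions and do not reproduce the application's update laws.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def concurrent_update(w_hat, current, history, gain_now, gain_hist):
    """One gradient step using the current sample plus the history stack.

    Each sample is (regressor, measured_state_derivative).
    """
    sigma_t, xdot_t = current
    err_t = dot(w_hat, sigma_t) - xdot_t
    grad = [gain_now * err_t * s for s in sigma_t]
    for sigma_j, xdot_j in history:
        err_j = dot(w_hat, sigma_j) - xdot_j
        grad = [gi + gain_hist * err_j * s for gi, s in zip(grad, sigma_j)]
    return [wi - gi for wi, gi in zip(w_hat, grad)]

# Hypothetical true weights, used only to generate consistent data.
w_true = [2.0, -1.0]
history = [(s, dot(w_true, s)) for s in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
current = ([1.0, 0.5], dot(w_true, [1.0, 0.5]))

w_hat = [0.0, 0.0]
for _ in range(500):
    w_hat = concurrent_update(w_hat, current, history, 0.1, 0.1)
```

Because the stored regressors span both parameter directions, the estimate converges to the generating weights even though the current sample never changes, which is the practical benefit of reusing historical data.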
In one possible embodiment, a history stack dataset of 40 sample points is collected, as shown in Tables 2 to 6, each of which lists 8 sets of data.
TABLE 2
TABLE 3
TABLE 4
TABLE 5
TABLE 6
Step 104: based on the identified parameters, learn the solution of the differential game through an adaptive iterative algorithm for solving the saddle point, and obtain the optimal robust controller of the system under the worst-case disturbance.
In the embodiment of the application, an adaptive iterative algorithm for solving the saddle point is provided to learn the solution of the differential game.
Specifically, as shown in FIG. 3, the adaptive iterative algorithm is implemented in the following steps:
S1: selecting the initial parameters, specifying a convergence accuracy within the range allowed by the engineering application, obtaining the initial state estimate and the initial augmented weight matrix estimate over the finite time interval, and calculating the initial cost, formulated as:
where the initial control input and the initial system disturbance are taken from the two admissible sets, and the remaining quantities are the initial state estimate, the initial augmented weight matrix estimate, the initial cost, and the initial measured state of the original differential game model;
S2: treating the control input as one player of the differential game and iteratively updating the control input, formulated as:
where:
S3: calculating the cost function after the control input is updated, and judging whether the updated control input reduces the cost function, in accordance with the principle that the control input seeks to minimize the cost function, formulated as:
where the measured state is that of the original differential game model under the updated control input; if the stated condition holds, the corresponding quantities are reassigned, where the two step sizes are positive iteration steps, and the algorithm returns to S2; otherwise it proceeds to S4;
S4: treating the system disturbance as the other player of the differential game and iteratively updating the system disturbance, formulated as:
where:
S5: updating the estimate of the augmented weight matrix, formulated as:
where the measured state is that of the original differential game model under the updated inputs;
S6: calculating the cost function after the system disturbance is updated, and judging whether the updated system disturbance increases the cost function, in accordance with the principle that the system disturbance seeks to maximize the cost function, formulated as:
if the stated condition holds, the corresponding quantities are reassigned, where the two step sizes are positive iteration steps, and the algorithm returns to S4; otherwise it proceeds to S7;
S7: judging whether the convergence of the control input meets the accuracy requirement; if so, proceeding to S8; otherwise, updating the iterates and returning to S2;
S8: judging whether the convergence of the augmented weight matrix estimate meets the accuracy requirement; if so, stopping the iteration to obtain the optimal robust controller of the system under the worst-case disturbance; otherwise, updating the iterates and returning to S2 to continue the iteration.
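The alternating structure of steps S2 to S8 can be sketched on a static toy game: the minimizing player takes a projected gradient step that is accepted only if the cost does not increase, the maximizing player takes a projected gradient step that is accepted only if the cost does not decrease, rejected steps are retried with a halved step size, and the loop stops once both iterates stop changing. The cost, the gradients, the step-halving rule, and the function `solve_saddle` are assumptions for illustration; the weight re-identification of S5 and the dynamic cost of the application are omitted.

```python
def clip(value, bound):
    """Project a scalar onto the interval [-bound, bound]."""
    return max(-bound, min(bound, value))

def solve_saddle(J, dJ_du, dJ_dw, u, w, u_max, w_max,
                 step_u=0.25, step_w=0.25, tol=1e-8, max_iter=1000):
    """Alternating descent (control) and ascent (disturbance) with step checks."""
    for _ in range(max_iter):
        u_prev, w_prev = u, w
        # Control player: accept the step only if the cost does not increase.
        a = step_u
        while a > 1e-12:
            cand = clip(u - a * dJ_du(u, w), u_max)
            if J(cand, w) <= J(u, w):
                u = cand
                break
            a *= 0.5
        # Disturbance player: accept the step only if the cost does not decrease.
        b = step_w
        while b > 1e-12:
            cand = clip(w + b * dJ_dw(u, w), w_max)
            if J(u, cand) >= J(u, w):
                w = cand
                break
            b *= 0.5
        # Convergence check on both iterates.
        if abs(u - u_prev) < tol and abs(w - w_prev) < tol:
            break
    return u, w

# Hypothetical cost: convex in u, concave in w, saddle point at (0.75, 0.25).
J = lambda u, w: (u - 1.0) ** 2 + 2.0 * u * w - (w + 0.5) ** 2
dJ_du = lambda u, w: 2.0 * (u - 1.0) + 2.0 * w
dJ_dw = lambda u, w: 2.0 * u - 2.0 * (w + 0.5)

u_opt, w_opt = solve_saddle(J, dJ_du, dJ_dw, 0.0, 0.0, u_max=2.0, w_max=2.0)
```

The acceptance tests play the role of the cost comparisons in S3 and S6, and the final tolerance check corresponds to the convergence judgments in S7 and S8.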
In a possible embodiment, the control input and disturbance bounds and the performance index are set to fixed values, the initial estimate of the augmented weight matrix is selected at random, and the initial conditions of the iterative algorithm are specified.
As a possible implementation, the initial control input and disturbance signals are selected as shown in FIG. 4. The initial trajectory under these inputs is shown in FIG. 5. The initial identification error of the system and the initial cost can then be calculated. Iterating according to the algorithm flow chart in FIG. 3 for 20 iterations, the change curves of the input and disturbance costs are shown in FIG. 6, and the change curve of the cost function is shown in FIG. 7.
It can be seen that the cost function converges to zero after 20 iterations, so that the robust control condition holds.
FIG. 8 shows the variation of the system identification error over the 20 iterations.
It can be seen that, as the number of iterations increases, the identification error decreases and settles within a bounded range.
In addition, FIG. 9 reflects an iterative process of system state and saddle point estimation.
According to the embodiment of the application, the controller is taken as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and a system description based on a two-player zero-sum differential game is established; a neural network is introduced into this system description, and a parameterized identification form of the uncertain model is established; the unknown parameters of the uncertain model are identified from online instantaneous data and offline historical data using a concurrent learning technique; based on the identified parameters, the solution of the differential game is learned through an adaptive iterative algorithm for solving the saddle point, and the optimal robust controller of the system under the worst-case disturbance is obtained. The traditional Pontryagin minimax principle is thereby extended and combined with parameter identification, so that control quality is effectively improved and the requirements of practical engineering are met.
FIG. 10 is a block diagram of a nonlinear electromechanical system on-line identification and robust control device 10 according to an embodiment of the present application, including:
A first establishing module 100, configured to take the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes it, and to establish a system description based on a two-player zero-sum differential game;
A second establishing module 200, configured to introduce a neural network into the system description based on the two-player zero-sum differential game and to establish a parameterized identification form of the uncertain model;
An identification module 300, configured to identify the unknown parameters of the uncertain model from online instantaneous data and offline historical data using a concurrent learning technique;
A solving module 400, configured to learn, based on the identified parameters, the solution of the differential game through an adaptive iterative algorithm for solving the saddle point, so as to obtain the optimal robust controller of the system under the worst-case disturbance.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
FIG. 11 shows a schematic block diagram of an example electronic device 700 that may be used to implement an embodiment of the application. The electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are exemplary only and are not meant to limit implementations of the application described and/or claimed herein.
As shown in FIG. 11, the device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 702 or a computer program loaded from a storage unit 708 into a random access memory (RAM) 703. The RAM 703 may also store various programs and data required for the operation of the device 700. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the methods and processes described above, such as the online identification and robust control method. For example, in some embodiments, the online identification and robust control method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the online identification and robust control method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the online identification and robust control method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), the Internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present application are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (5)

1. An online identification and robust control method for a nonlinear electromechanical system, characterized by comprising the following steps:
treating the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes the cost function, and establishing a system description form based on a two-player zero-sum differential game;
introducing a neural network into the system description form based on the two-player zero-sum differential game, and establishing a parameterized identification form of the uncertain model;
identifying the unknown parameters in the uncertain model from online instantaneous data and offline historical data by means of a concurrent (parallel) learning technique;
learning, based on the identified parameters, the solution of the differential game through an adaptive iterative algorithm for solving saddle points, and obtaining the optimal robust controller of the system under the worst-case disturbance;
wherein establishing the system description form based on the two-player zero-sum differential game comprises:
establishing a differential game model that takes the control input and the system disturbance into account, formulated as:
[equation: differential game model]
wherein the quantities in the model are, in order, the system state, the control input, the system disturbance, the system drift dynamics function, the two system input gain matrices, the initial time of the differential game, and the initial state of the differential game;
to represent the maximum magnitudes of the control input and the disturbance, two constraint sets are defined:
[equation: constraint sets for the control input and the disturbance]
wherein the bounds appearing in the sets are those of the control input and of the disturbance, respectively;
for the differential game model, the following non-quadratic cost function is adopted:
[equation: non-quadratic cost function]
wherein the cost is taken over a finite time interval and contains a state function together with two further functions, the latter being defined as the following non-negative integral functions:
[equation: non-negative integral functions]
wherein the two variables of integration are integral arguments, the four weighting matrices are positive definite matrices, and:
[equation: additional relation among the matrices]
in line with the objective of the two-player zero-sum differential game, a saddle point is determined such that the following inequality holds for all control inputs and all disturbances in the respective constraint sets:
[equation: saddle-point inequality]
wherein the saddle point consists of the robust control input and the worst-case disturbance; according to the Pontryagin minimax principle, the saddle point of the differential game model satisfies the following set of equations:
[equation: necessary conditions from the Pontryagin minimax principle]
wherein the remaining quantities are the costate variable of the Pontryagin minimax principle and the Hamiltonian function, the latter defined as:
[equation: Hamiltonian function]
wherein introducing the neural network to establish the parameterized identification form of the uncertain model comprises:
approximating the unknown system drift dynamics function and the unknown system input gain matrices as:
[equation: neural-network approximations of the drift dynamics function and the input gain matrices]
wherein each of the three approximations has its own unknown weight matrix, belonging to a neural network whose hidden layer has a given number of neurons, its own activation function, and its own approximation error;
as the number of neurons tends to infinity, the approximation errors converge to zero; converting the differential game model according to the approximations gives:
[equation: parameterized differential game model]
wherein the model contains the augmented weight matrix, the augmented activation function, and the lumped estimation error, and the lumped estimation error converges uniformly to zero;
wherein identifying the unknown parameters in the uncertain model from the online instantaneous data and the offline historical data by means of the concurrent learning technique comprises:
identifying the unknown augmented weight matrix by a data-driven method, in which a series of time points is selected within the time interval of the cost function:
[equation: selected time points]
the data points recorded at these time points are defined as the history stack dataset, each data point being the set of the system state, the control input, the system disturbance, and the state derivative at the corresponding time point; a correlation matrix is defined from the recorded data, and during data acquisition the correlation matrix must satisfy the following condition:
[equation: correlation matrix and the condition it must satisfy]
the estimates of the system state and of the augmented weight matrix at each iteration are denoted separately, and, using the history stack dataset, their adaptive update laws are designed as:
[equation: adaptive update laws for the state estimate and the weight-matrix estimate]
wherein the two gains are positive parameters, and the remaining quantities are the estimation error of the system state and the estimation error of the augmented weight matrix at that iteration;
wherein learning, based on the identified parameters, the solution of the differential game through the adaptive iterative algorithm for solving saddle points, and obtaining the optimal robust controller of the system under the worst-case disturbance, comprises:
S1: selecting initial parameters, specifying a convergence accuracy within the range allowed in engineering practice, obtaining the initial state estimate and the initial estimate of the augmented weight matrix over the finite time interval, and calculating the initial cost, formulated as:
[equation: initialization and initial cost]
wherein the quantities involved are the initial control input and the initial system disturbance, taken from their respective constraint sets, the initial state estimate, the initial weight-matrix estimate, the initial cost, and the initial measured state of the original differential game model;
S2: treating the control input as one player of the differential game and iteratively updating the control input, formulated as:
[equation: control input update and its auxiliary definitions]
S3: calculating the cost function after the control input is updated and, since the control input must minimize the cost function, judging whether the updated control input reduces the cost function, formulated as:
[equation: cost after the control input update]
wherein the state used is the measured state of the original differential game model under the corresponding inputs; if [condition], let [step-size update], wherein the two quantities being updated are positive iteration step sizes, and return to S2; otherwise proceed to S4;
S4: treating the system disturbance as the other player of the differential game and iteratively updating the system disturbance, formulated as:
[equation: disturbance update and its auxiliary definitions]
S5: updating the estimate of the augmented weight matrix, formulated as:
[equation: update of the augmented weight-matrix estimate and its auxiliary definitions]
wherein the state used is the measured state of the original differential game model under the corresponding inputs;
S6: calculating the cost function after the system disturbance is updated and, since the system disturbance must maximize the cost function, judging whether the updated system disturbance increases the cost function, formulated as:
[equation: cost after the disturbance update]
if [condition], let [step-size update], wherein the two quantities being updated are positive iteration step sizes, and return to S4; otherwise proceed to S7;
S7: judging whether the convergence of the control input meets the accuracy requirement; if [condition], proceed to S8; otherwise let [reassignment of the five iterates] and return to S2;
S8: judging whether the convergence of the estimate of the augmented weight matrix meets the accuracy requirement; if [condition], stop the iterative process and obtain the optimal robust controller of the system under the worst-case disturbance; otherwise let [reassignment of the five iterates] and return to S2 to continue the iterative process.
2. An online identification and robust control device for a nonlinear electromechanical system, comprising:
a first establishing module, configured to treat the controller as the player that minimizes the cost function and the system disturbance as the player that maximizes the cost function, and to establish a system description form based on a two-player zero-sum differential game;
a second establishing module, configured to introduce a neural network into the system description form based on the two-player zero-sum differential game and to establish a parameterized identification form of the uncertain model;
an identification module, configured to identify the unknown parameters in the uncertain model from online instantaneous data and offline historical data by means of a concurrent (parallel) learning technique;
a solving module, configured to learn, based on the identified parameters, the solution of the differential game through an adaptive iterative algorithm for solving saddle points, so as to obtain the optimal robust controller of the system under the worst-case disturbance;
wherein establishing the system description form based on the two-player zero-sum differential game comprises:
establishing a differential game model that takes the control input and the system disturbance into account, formulated as:
[equation: differential game model]
wherein the quantities in the model are, in order, the system state, the control input, the system disturbance, the system drift dynamics function, the two system input gain matrices, the initial time of the differential game, and the initial state of the differential game;
to represent the maximum magnitudes of the control input and the disturbance, two constraint sets are defined:
[equation: constraint sets for the control input and the disturbance]
wherein the bounds appearing in the sets are those of the control input and of the disturbance, respectively;
for the differential game model, the following non-quadratic cost function is adopted:
[equation: non-quadratic cost function]
wherein the cost is taken over a finite time interval and contains a state function together with two further functions, the latter being defined as the following non-negative integral functions:
[equation: non-negative integral functions]
wherein the two variables of integration are integral arguments, the four weighting matrices are positive definite matrices, and:
[equation: additional relation among the matrices]
in line with the objective of the two-player zero-sum differential game, a saddle point is determined such that the following inequality holds for all control inputs and all disturbances in the respective constraint sets:
[equation: saddle-point inequality]
wherein the saddle point consists of the robust control input and the worst-case disturbance; according to the Pontryagin minimax principle, the saddle point of the differential game model satisfies the following set of equations:
[equation: necessary conditions from the Pontryagin minimax principle]
wherein the remaining quantities are the costate variable of the Pontryagin minimax principle and the Hamiltonian function, the latter defined as:
[equation: Hamiltonian function]
wherein introducing the neural network to establish the parameterized identification form of the uncertain model comprises:
approximating the unknown system drift dynamics function and the unknown system input gain matrices as:
[equation: neural-network approximations of the drift dynamics function and the input gain matrices]
wherein each of the three approximations has its own unknown weight matrix, belonging to a neural network whose hidden layer has a given number of neurons, its own activation function, and its own approximation error;
as the number of neurons tends to infinity, the approximation errors converge to zero; converting the differential game model according to the approximations gives:
[equation: parameterized differential game model]
wherein the model contains the augmented weight matrix, the augmented activation function, and the lumped estimation error, and the lumped estimation error converges uniformly to zero;
wherein identifying the unknown parameters in the uncertain model from the online instantaneous data and the offline historical data by means of the concurrent learning technique comprises:
identifying the unknown augmented weight matrix by a data-driven method, in which a series of time points is selected within the time interval of the cost function:
[equation: selected time points]
the data points recorded at these time points are defined as the history stack dataset, each data point being the set of the system state, the control input, the system disturbance, and the state derivative at the corresponding time point; a correlation matrix is defined from the recorded data, and during data acquisition the correlation matrix must satisfy the following condition:
[equation: correlation matrix and the condition it must satisfy]
the estimates of the system state and of the augmented weight matrix at each iteration are denoted separately, and, using the history stack dataset, their adaptive update laws are designed as:
[equation: adaptive update laws for the state estimate and the weight-matrix estimate]
wherein the two gains are positive parameters, and the remaining quantities are the estimation error of the system state and the estimation error of the augmented weight matrix at that iteration;
wherein learning, based on the identified parameters, the solution of the differential game through the adaptive iterative algorithm for solving saddle points, and obtaining the optimal robust controller of the system under the worst-case disturbance, comprises:
S1: selecting initial parameters, specifying a convergence accuracy within the range allowed in engineering practice, obtaining the initial state estimate and the initial estimate of the augmented weight matrix over the finite time interval, and calculating the initial cost, formulated as:
[equation: initialization and initial cost]
wherein the quantities involved are the initial control input and the initial system disturbance, taken from their respective constraint sets, the initial state estimate, the initial weight-matrix estimate, the initial cost, and the initial measured state of the original differential game model;
S2: treating the control input as one player of the differential game and iteratively updating the control input, formulated as:
[equation: control input update and its auxiliary definitions]
S3: calculating the cost function after the control input is updated and, since the control input must minimize the cost function, judging whether the updated control input reduces the cost function, formulated as:
[equation: cost after the control input update]
wherein the state used is the measured state of the original differential game model under the corresponding inputs; if [condition], let [step-size update], wherein the two quantities being updated are positive iteration step sizes, and return to S2; otherwise proceed to S4;
S4: treating the system disturbance as the other player of the differential game and iteratively updating the system disturbance, formulated as:
[equation: disturbance update and its auxiliary definitions]
S5: updating the estimate of the augmented weight matrix, formulated as:
[equation: update of the augmented weight-matrix estimate and its auxiliary definitions]
wherein the state used is the measured state of the original differential game model under the corresponding inputs;
S6: calculating the cost function after the system disturbance is updated and, since the system disturbance must maximize the cost function, judging whether the updated system disturbance increases the cost function, formulated as:
[equation: cost after the disturbance update]
if [condition], let [step-size update], wherein the two quantities being updated are positive iteration step sizes, and return to S4; otherwise proceed to S7;
S7: judging whether the convergence of the control input meets the accuracy requirement; if [condition], proceed to S8; otherwise let [reassignment of the five iterates] and return to S2;
S8: judging whether the convergence of the estimate of the augmented weight matrix meets the accuracy requirement; if [condition], stop the iterative process and obtain the optimal robust controller of the system under the worst-case disturbance; otherwise let [reassignment of the five iterates] and return to S2 to continue the iterative process.
3. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
The memory stores computer-executable instructions;
The processor executes computer-executable instructions stored in the memory to implement the method of claim 1.
4. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to implement the method of claim 1.
5. A computer program product comprising a computer program which, when executed by a processor, implements the method of claim 1.
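The alternating structure of steps S2 to S7 in claim 1 (the controller takes a descent step that is kept only if the cost falls, the disturbance takes an ascent step that is kept only if the cost rises, and the loop stops once the iterates settle) can be illustrated on a toy problem. The sketch below is a simplification under stated assumptions: the cost is a hypothetical static function `cost(u, d)` rather than a finite-horizon functional, the step is halved whenever a trial move fails its test, and the weight re-estimation of S5 and the check of S8 are left out. All names and constants are illustrative.

```python
def cost(u, d):
    """Toy zero-sum payoff: convex in u (minimiser), concave in d (maximiser)."""
    return (u - 1.0) ** 2 - (d - 0.5) ** 2 + u * d


def grad_u(u, d):
    return 2.0 * (u - 1.0) + d


def grad_d(u, d):
    return -2.0 * (d - 0.5) + u


def clip(v, bound):
    """Keep a player's action inside its magnitude bound."""
    return max(-bound, min(bound, v))


def player_move(v, grad, payoff, step, bound, sign, max_halvings=30):
    """Try a bounded gradient step; halve the step until the payoff moves the right way.

    sign = -1 for the controller (descent), sign = +1 for the disturbance (ascent).
    """
    base = payoff(v)
    for _ in range(max_halvings):
        cand = clip(v + sign * step * grad, bound)
        if sign * (payoff(cand) - base) > 0.0:
            return cand
        step *= 0.5
    return v  # no improving step found; keep the current value


def solve_saddle(u=-1.5, d=1.5, step=0.2, bound=2.0, tol=1e-6, max_iter=1000):
    for _ in range(max_iter):
        # Controller update followed by its cost-decrease test (cf. S2, S3).
        u_new = player_move(u, grad_u(u, d), lambda v: cost(v, d), step, bound, -1)
        # Disturbance update followed by its cost-increase test (cf. S4, S6).
        d_new = player_move(d, grad_d(u_new, d), lambda v: cost(u_new, v), step, bound, +1)
        # Convergence check on the iterates (cf. S7).
        converged = abs(u_new - u) < tol and abs(d_new - d) < tol
        u, d = u_new, d_new
        if converged:
            break
    return u, d


if __name__ == "__main__":
    print(solve_saddle())
```

For this toy payoff the stationarity conditions give the saddle point u = 0.6, d = 0.8, which lies strictly inside the bounds, so the clipping never becomes active along the iteration.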
CN202410372235.7A 2024-03-29 Nonlinear electromechanical system on-line identification and robust control method and device Active CN117970817B (en)

Publications (2)

Publication Number Publication Date
CN117970817A CN117970817A (en) 2024-05-03
CN117970817B true CN117970817B (en) 2024-06-21

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111880412A (en) * 2020-08-12 2020-11-03 长春工业大学 Reconfigurable robot zero and neural optimal control method based on single evaluation network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang, Dezheng, et al. Model-Free Finite-Horizon Zero-Sum Differential Games for Nonlinear Systems. 2023 8th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC), 2023, pp. 152-157. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant