US20170271984A1

US20170271984A1 - Using battery dc characteristics to control power output

Info

Publication number: US20170271984A1
Application number: US15/096,091
Authority: US
Inventors: Wolf Kohn; Vishnu Vettrivel; Jonathan Cross; Pengbo Zhang; Michael Luis Sandoval; Brian Schaper; Neel Master; Brandon Weiss; David Kettler
Original assignee: Atigeo LLC
Current assignee: Veritone Alpha Inc
Priority date: 2016-03-04
Filing date: 2016-04-11
Publication date: 2017-09-21
Also published as: KR20180123075A; EP3424139B1; US10601316B2; CN109478848B; EP3424139A1; KR102019140B1; EP3424139A4; US20190044440A1; JP6584040B2; CN109478848A; JP2019513000A; WO2017152180A1

Abstract

Techniques are described for implementing automated control systems to control operations of specified physical target systems, such as with one or more batteries used to store and provide electrical power. Characteristics of each battery's state may be used to perform automated control of DC power from the battery, such as in a real-time manner and to optimize long-term operation of the battery. In some situations, multiple batteries are controlled by using multiple control systems each associated with one of the batteries, and with overall control being coordinated in a distributed manner using interactions between the multiple control systems. A system that includes one or more batteries to be controlled may further include additional components in some situations, such as one or more electrical sources and/or one or more electrical loads, with one non-exclusive example of a type of such system being one or more home electrical power systems.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application No. 62/304,034, filed Mar. 4, 2016 and entitled “Using Battery DC Characteristics To Control Power Quality,” which is hereby incorporated by reference in its entirety.

BACKGROUND

Various attempts have been made to implement automated control systems for various types of physical systems having inputs or other control elements that the control system can manipulate to attempt to provide desired output or other behavior of the physical systems being controlled. Such automated control systems have used various types of architectures and underlying computing technologies to attempt to implement such functionality, including to attempt to deal with issues related to uncertainty in the state of the physical system being controlled, the need to make control decisions in very short amounts of time and with only partial information, etc. One example of such an automated control system includes a system for controlling operations of a battery that is discharging electrical power to support a load and/or is charging using electrical power from a source, with uncertainty about an internal temperature and/or chemical state of the battery, and potentially with ongoing changes in load, source and/or battery internal state.
However, various difficulties exist with existing automated control systems and their underlying architectures and computing technologies, including with respect to managing large numbers of constraints (sometimes conflicting), operating in a coordinated manner with other systems, etc. Particular difficulties can arise when attempting to control one or more batteries in situations in which multiple conflicting constraints and/or goals exist.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

FIG. 1 is a network diagram illustrating an example environment in which a system for performing cooperative distributed control of target systems may be configured and initiated.

FIG. 2 is a network diagram illustrating an example environment in which a system for performing cooperative distributed control of target systems may be implemented.

FIG. 3 is a block diagram illustrating example computing systems suitable for executing an embodiment of a system for performing cooperative distributed control of target systems in configured manners.

FIG. 4 illustrates a flow diagram of an example embodiment of a Collaborative Distributed Decision (CDD) System routine.

FIGS. 5A-5B illustrate a flow diagram of an example embodiment of a CDD Decision Module Construction routine.

FIGS. 6A-6B illustrate a flow diagram of an example embodiment of a decision module routine.

FIGS. 7A-7B illustrate a flow diagram of an example embodiment of a CDD Control Action Determination routine.

FIGS. 8A-8B illustrate a flow diagram of an example embodiment of a CDD Coordinated Control Management routine.

FIG. 9 illustrates a flow diagram of an example embodiment of a routine for a target system being controlled.

FIG. 10 illustrates a network diagram of a portion of a distributed architecture of an example CDI system.

FIG. 11 illustrates a network diagram of a portion of an example Rules Builder component.

FIG. 12 illustrates a network diagram of example sub-components of an example Rules Builder component.

FIG. 13 illustrates a network diagram of interactions of portions of a Rules Builder component with an executing agent.

FIG. 14 illustrates a network diagram of interactions of portions of a Rules Builder component with chattering components.

FIG. 15 illustrates a diagram of a sliding window for use with chattering.

FIG. 16 illustrates a diagram of using Lebesgue integration to approximate a control trajectory for chattering.

FIG. 17 illustrates a diagram of various interactions of different portions of a CDI system.

FIG. 18 illustrates a diagram of example different types of rules.

FIG. 19 illustrates an example user interface related to medical/clinical auto-coding.

FIG. 20 illustrates a diagram of some components of a CDI system.

FIG. 21 illustrates a diagram of performing Pareto processing for use with mean field techniques.

FIG. 22 illustrates a network diagram of an example decision module agent.

FIG. 23 illustrates a network diagram of an example of offline workflows for knowledge capture.

FIG. 24 illustrates a network diagram of an example of workflows for mean field computation and Pareto Optimal.

FIG. 25 illustrates a network diagram of an example of an automated control system for a home solar micro-grid electrical generating system.

FIG. 26 illustrates a diagram of workflow and components of a portion of a CDI system.

FIG. 27 illustrates a diagram of a workflow for an inference process portion of a CDI system.

FIG. 28 illustrates a diagram of an overview workflow for a portion of a CDI system.

FIGS. 29A-29K illustrate examples of using a CDI system to iteratively determine near-optimal solutions over time for controlling a target system.

FIG. 30 is a block diagram illustrating example components of an embodiment of a system for using characteristics of a battery's state to perform automated control of DC power from the battery.

FIG. 31 is a block diagram illustrating example components of an embodiment of a system that performs automated control of DC power from multiple batteries in a coordinated manner.

FIG. 32 is a block diagram illustrating example components of an embodiment of a system for performing automated control of DC power from a battery that is part of a home electrical power system with solar power being generated, with the home power generation and use being monitored and synchronized by an external entity.

FIG. 33 is a block diagram illustrating example components of an embodiment of a system for performing automated control of DC power from a battery that is part of a home electrical power system with solar power being generated, in which typical monitoring and synchronization of the home power generation and use is temporarily interrupted.

FIG. 34 is a block diagram illustrating example components of an embodiment of a system for modeling a battery's state and operation to enable automated control of DC power from the battery.

FIG. 35 is a block diagram illustrating example visual displays of performance of an embodiment of a system that is using characteristics of a battery's state to perform automated control of DC power from the battery, with the home power generation and use being monitored and synchronized by an external entity.

FIG. 36 is a block diagram illustrating example visual displays of performance of an embodiment of a system that is using characteristics of a battery's state to perform automated control of DC power from the battery, in which typical monitoring and synchronization of the home power generation and use is temporarily interrupted.

FIG. 37 is a block diagram illustrating example components of an embodiment of a system for using characteristics of a battery's state to perform automated control of DC power from the battery.

FIG. 38 is a block diagram illustrating example components of an embodiment of a system for using characteristics of a battery's state to perform automated control of DC power from the battery.

FIG. 39 is a block diagram illustrating example components of an embodiment of a system for using characteristics of a battery's state to perform automated control of DC power from the battery.

FIG. 40 is a block diagram illustrating example components of an embodiment of a system that performs automated control of DC power from multiple batteries in a coordinated manner.

FIG. 41 is a block diagram illustrating an example environment in which a system for performing cooperative distributed control of target systems may be implemented.

FIG. 42 is a block diagram illustrating example components of an embodiment of a system that performs automated control of DC power from multiple batteries in a coordinated manner.

FIG. 43 is a block diagram illustrating example components of an embodiment of a system that performs automated control of DC power from multiple batteries in a coordinated manner.

FIG. 44 is a block diagram illustrating example components of an embodiment of a system that performs automated control of DC power from multiple batteries in a coordinated manner.

FIG. 45 is a block diagram illustrating example components of an embodiment of a system for using characteristics of a battery's state to perform automated control of DC power from the battery, as well as to use price forecasts and additional information to enhance financial performance of the system.

FIG. 46 is a block diagram illustrating example computing systems suitable for executing an embodiment of a system for performing cooperative distributed control of target systems in configured manners.

FIG. 47 is a block diagram illustrating an example architecture of a battery controller component.

FIG. 48 is a block diagram illustrating an example feedback controller.

FIG. 49 is a block diagram illustrating an example architecture for training a battery model for a battery.

FIG. 50 is a block diagram illustrating an example architecture for computing feedback gain for a tracking controlling system for a battery.

FIG. 51 is a block diagram illustrating an example architecture for a battery control system.

DETAILED DESCRIPTION

Techniques are described for implementing automated control systems to control or otherwise manipulate at least some operations of specified physical systems or other target systems. In at least some embodiments, the physical target systems include one or more batteries used to store and provide electrical power, and the automated operations to control the target systems include using characteristics of each battery's state to perform automated control of DC (direct current) power that is provided from the battery, such as by using a DC-to-DC amplifier (e.g., a FET, or field-effect transistor, amplifier) connected to the battery to control an amount of electrical current and/or voltage being output from the battery, in a real-time manner and to optimize long-term operation of the battery. The DC-to-DC amplifier may, for example, be part of a buck converter (or step-down converter) that steps down voltage while stepping up current from its input (supply) to its output (load) and/or be part of a boost converter (or step-up converter) that steps up voltage while stepping down current from its input (supply) to its output (load), referred to generally at times herein as a “boost/buck controller” or “buck/boost controller”. In addition, in some embodiments and situations, multiple batteries may be controlled in such a manner by using multiple control systems that are each associated with one of the batteries, and with the overall control of the multiple batteries being coordinated in a distributed manner, such as based on interactions between the multiple associated control systems for the multiple batteries.
A system that includes one or more batteries to be controlled may further include additional components in some embodiments and situations, such as one or more electrical sources and/or one or more electrical loads, with one non-exclusive example of such a type of system being one or more home or business electrical power systems that may optionally include electrical generation sources (e.g., solar panels, wind turbines, etc.) as well as electrical load from the house or business. Additional details are included below related to such techniques for implementing and using automated control systems to control target systems with one or more batteries.
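As an illustration of the buck and boost behavior mentioned above (voltage stepped one way, current stepped the other), the following is a minimal sketch using the standard textbook relations for an ideal, lossless converter in continuous conduction. These relations and the function names `buck_output` and `boost_output` are assumptions introduced here for illustration; they are not taken from the described embodiments, and a real converter has losses.

```python
def buck_output(v_in, i_in, duty):
    """Ideal buck (step-down) converter: V_out = D * V_in, I_out = I_in / D.

    Assumes a lossless converter in continuous conduction (textbook model).
    """
    if not 0.0 < duty <= 1.0:
        raise ValueError("duty cycle must be in (0, 1]")
    return v_in * duty, i_in / duty


def boost_output(v_in, i_in, duty):
    """Ideal boost (step-up) converter: V_out = V_in / (1 - D), I_out = I_in * (1 - D).

    Same idealizing assumptions as buck_output.
    """
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty cycle must be in [0, 1)")
    return v_in / (1.0 - duty), i_in * (1.0 - duty)


if __name__ == "__main__":
    # Buck: voltage halves, current doubles, power is unchanged in the ideal case.
    print(buck_output(48.0, 2.0, 0.5))
    # Boost: voltage doubles, current halves.
    print(boost_output(12.0, 4.0, 0.5))
```

In this idealized model the product of voltage and current is the same at input and output, so adjusting the duty cycle trades voltage against current rather than creating or removing power.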
The described techniques may provide a variety of benefits and advantages. Non-exclusive examples of such benefits and advantages include controlling power output of a battery on the DC side (e.g., by varying DC current and voltage of the battery), allowing the battery to operate in its optimal or near-optimal physical state with respect to internal temperature and/or other battery parameters (e.g., by maintaining an internal chemistry of the battery within one or more chemical stoichiometric limits, such as a stoichiometric range), rather than fixing AC (alternating current) voltage and/or current being provided by an inverter connected to the battery at a single specified level, and forcing the battery to operate at a saturation level that provides maximum power but at the cost of possible non-reversible battery damage. In this manner, battery longevity and other operational performance may be optimized or otherwise enhanced by the described techniques, by allowing the battery to operate in a preferred range in which battery chemistry changes are reversible and in which short-circuits and other non-reversible damage are reduced or eliminated. In addition, in at least some embodiments, the automated control of the battery may further include active control of the battery to enhance and maintain power output amount resonance with the other components (e.g., a load and/or an external power grid), such that the amount of power provided does not exceed what is needed, while also satisfying at least a defined percentage or other amount of power output requests or load amounts (e.g., 50%, 65%, 100% or any other defined percentage or other amount).
In this manner, such embodiments may be conceptualized as an automated control system to manage the internal state and operational performance (including longevity) of a battery being controlled, while satisfying power output requests if possible, rather than existing systems that fix the output voltage or current of a battery and fulfill all power requests even if it causes battery damage or other problems (e.g., excessive heating). While the benefits and operations discussed above and in some other locations herein relate to controlling power being output from a battery, it will be appreciated that the same techniques may be used to control power being stored into a battery from one or more sources, so as to cause the battery to operate in its optimal or near-optimal physical state with respect to heat and other battery parameters while storing power, and to optimize or otherwise enhance battery longevity and other operational performance by allowing the battery to operate in a preferred range in which battery chemistry changes are reversible, while satisfying at least a defined percentage or other amount of power input requests (e.g., 50%, 65%, 100% or any other defined percentage or other amount) for power being supplied by one or more sources. Additional benefits and advantages include the following, with the term ‘optimizing’ a feature or result as used herein meaning generally improving that feature or result (e.g., via partial or full optimization), and with the term ‘real-time’ as used herein meaning with respect to a time frame (e.g., fractions of a second, seconds, minutes, etc.) specific to a resulting component or system being controlled, unless otherwise indicated:

- can improve battery lifetime by optimizing DC control variables, such as I (current), V (voltage) and R (amount of power being output)
- can improve battery lifetime by optimizing DC control variables (I, V, R) in conjunction with a prior characterization of battery chemistry, and can optimize at DC level to improve performance and longevity
- can optimize variables in real-time in DC domain to solve for objectives in AC phase
- can optimize AC output in real-time to match grid frequency resulting in resonant operation, such as via control of only battery output and no other grid components
- can improve charge/discharge cycles to improve long-term battery availability
- can improve AC load response
- can improve AC load response in combination with improving long-term battery availability
- battery controller can run as embedded software on a processor in a self-sufficient manner
- battery controller can be monitored and updated continuously from external location (e.g., the cloud or other network-accessible location)
- battery controller can transmit battery characteristics to improve performance
- can avoid expenses of static VAR compensator hardware

Various other benefits and advantages may be further realized in at least some embodiments, as discussed in part in greater detail below.

FIG. 30 includes a block diagram 3000 illustrating example components of an embodiment of a system for using characteristics of a battery's state along with other related information to perform automated control of DC power from the battery—in particular, various components of example system 3000 interact to control operations of the battery according to one or more defined goals in light of defined constraints, rules and other criteria, as discussed further below. In some embodiments, the automated activities to control the battery may be performed in a real-time manner and/or to optimize long-term operation of the battery (e.g., the life of the battery), while satisfying as many external requests for power (e.g., from a utility to which the battery can supply power) as is possible (e.g., at least a defined percentage or quantity of such requests).
In the illustrated example of FIG. 30, a battery 3010 is shown that is being controlled via an actuator 3030 receiving a corresponding control signal from a battery controller component 3040 (also referred to as a “tracking controller” and/or “battery tracking controller” at times herein), such as by the battery controller specifying an amount of power to be generated as DC output of the battery. The specified power amount to be generated may include information indicating, for example, to increase or decrease the power being output by a specified amount, or to not change the power output. While not illustrated here, the output of the battery may serve to provide power to one or more loads (not shown), and in at least some embodiments may be connected to an inverter/rectifier component to convert the power output of the battery to AC power to support corresponding loads; such an inverter may, for example, control power being provided from the battery by regulating voltage and/or frequency of the AC power. Similarly, while also not illustrated here, input of the battery may serve to receive power from one or more sources (not shown), and in at least some embodiments may be connected to an inverter/rectifier component to convert AC power input from the sources to DC power for the battery; such a rectifier may, for example, control power being provided to the battery by regulating voltage and/or frequency of the AC power.
As part of determining how to control the battery, the battery controller component receives input from a sensor module 3020 regarding an internal state (not shown) of the battery, such as current values for voltage, electrical current, temperature, etc., and supplies corresponding information to a CDI agent 3050. The CDI agent, which is also referred to at times herein as a CDD (Collaborative Distributed Decision) decision module or system, receives the information from the battery controller related to the state of the battery, and also receives power supply requests from a utility component 3060, such as in a situation in which the battery supplies power at some or all times to an electrical grid (not shown) controlled by the utility. In particular, the CDI agent receives a particular request from the utility, receives and analyzes information about the state of the battery, and determines corresponding operations to take at the current time for the battery (e.g., an amount of output power to be supplied from the battery, and/or an amount of input power to be received and stored by the battery), which in at least some situations involve attempting to fully or partially satisfy the request from the utility for power in a real-time manner if the request can be satisfied in a way that also satisfies other constraints on the battery performance given the current state of the battery and the defined goal(s), such as to enable the battery to operate in a desired non-saturation range or level (e.g., with respect to an estimated internal temperature of the battery and/or estimated internal chemistry of the battery). 
After determining the corresponding operations to take at the current time for the battery, the CDI agent provides a corresponding tracking control signal to the battery controller, which determines how to currently modify or manipulate the actuator to effectuate the corresponding operations for the tracking control signal (e.g., an amount of positive or negative change to make in an amount of current being output from the battery), and sends a corresponding control signal to the actuator as discussed above.
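The flow described above (sensor readings to the battery controller, state and utility request to the CDI agent, a tracking control signal back to the battery controller, and a control signal to the actuator) can be sketched roughly as follows. This is a simplified illustration only: the names `BatteryState`, `cdi_agent_target` and `battery_controller_signal`, the single temperature threshold, and the numeric limits are all hypothetical assumptions, and the actual decision logic of a CDI agent is considerably richer than one rule.

```python
from dataclasses import dataclass


@dataclass
class BatteryState:
    """Observable values reported by the sensor module (hypothetical fields)."""
    voltage: float      # volts
    current: float      # amps currently being output
    temperature: float  # degrees C, used here as a stand-in for internal state


def cdi_agent_target(state, requested_power_w, max_temp_c=45.0, max_power_w=5000.0):
    """Pick a target output power that satisfies the request only as far as
    the (assumed) constraints allow."""
    present_power_w = state.voltage * state.current
    if state.temperature >= max_temp_c:
        # Assumed rule: when too warm, never raise output above its present level.
        ceiling_w = min(max_power_w, present_power_w)
    else:
        ceiling_w = max_power_w
    return max(0.0, min(requested_power_w, ceiling_w))


def battery_controller_signal(state, target_power_w, max_step_a=1.0):
    """Convert the tracking target into a bounded change in output current
    (positive to increase, negative to decrease, zero for no change)."""
    if state.voltage <= 0.0:
        raise ValueError("voltage must be positive")
    delta_a = target_power_w / state.voltage - state.current
    return max(-max_step_a, min(max_step_a, delta_a))


if __name__ == "__main__":
    state = BatteryState(voltage=50.0, current=10.0, temperature=30.0)
    target = cdi_agent_target(state, requested_power_w=600.0)
    print(target, battery_controller_signal(state, target))
```

Splitting the decision (what power to aim for) from the actuation step (how much to change the current now) mirrors the separation between the CDI agent and the battery controller described in the text, though either function could be replaced by a more detailed model.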
While not illustrated in FIG. 30, the CDI Agent and/or battery controller may in some embodiments include a stored model of the battery that is used to estimate internal state of the battery and to select particular operations to perform based in part on that internal state. For example, in some embodiments a generic battery model may be used that is applicable to any type of battery, while in other embodiments a battery model may be used that is specific to a type of the battery (e.g., a type of chemical reaction used to store and/or generate electricity, such as lithium ion or nickel cadmium), while in yet other embodiments a battery model may be used that is designed and/or configured specifically for the particular battery in use. In addition, in at least some embodiments, a battery model that is initially employed in a particular system with a particular battery may be updated over time, such as to reflect improvements to the underlying structure of the model and/or to train the model to reflect operational characteristics specific to the particular battery and/or system in use (e.g., by monitoring how changes in observable battery state correlate to corresponding external battery electrical load and/or electrical source)—when training or otherwise adapting a model to a particular battery and/or system, the training/adaption operations may in some embodiments be performed initially in a training phase before using the automated control system to control the battery, and/or in some embodiments may be performed continuously or periodically while the automated control system is controlling the battery (e.g., to reflect changes over time in an impedance profile of the battery). Additional details are included elsewhere herein regarding such models, including their construction and use. 
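One simple way to picture the continuous or periodic model adaptation described above is an online estimate of a single battery parameter that is nudged toward each new observation. The sketch below tracks an internal resistance value with an exponential moving average. The choice of parameter, the update rule, and the names `ResistanceModel` and `update` are assumptions made for illustration, not the battery model of the described embodiments.

```python
class ResistanceModel:
    """Toy battery model holding one adaptable parameter (internal resistance)."""

    def __init__(self, initial_ohms, learning_rate=0.1):
        self.ohms = initial_ohms
        self.learning_rate = learning_rate

    def update(self, open_circuit_v, terminal_v, current_a):
        """Move the estimate toward the resistance implied by one observation.

        Uses R = (V_open_circuit - V_terminal) / I, which assumes a simple
        series-resistance picture of the battery.
        """
        if abs(current_a) < 1e-6:
            # No meaningful current flowing, so this sample carries no information.
            return self.ohms
        observed_ohms = (open_circuit_v - terminal_v) / current_a
        self.ohms += self.learning_rate * (observed_ohms - self.ohms)
        return self.ohms


if __name__ == "__main__":
    model = ResistanceModel(initial_ohms=0.05, learning_rate=0.5)
    print(model.update(open_circuit_v=52.0, terminal_v=51.0, current_a=10.0))
```

A small learning rate makes the estimate slow to change but robust to noisy readings, while a larger one tracks drift in the battery more quickly.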
In addition, while in some embodiments the battery controller and CDI agent may be implemented as separate components (e.g., with the battery controller implemented in whole or in part in hardware and/or firmware that is attached to the battery or otherwise at a location of the battery, and with the CDI agent implemented in part by software instructions executing on one or more computing systems remote from the battery location and optionally communicating with the battery controller over one or more intervening computer networks), in other embodiments the CDI agent and battery controller may be implemented as a single component (whether at the location of the battery or remote from it). Further details regarding operation of the CDI agent to determine operations to take for the battery are discussed in greater detail below.
In addition, while not illustrated with respect to FIG. 30, in some embodiments multiple batteries (e.g., tens, hundreds, thousands, millions, etc.) may each have an associated CDI agent that controls actions of that battery in a similar manner, and with the various batteries acting together in a coordinated manner to supply aggregate power to the utility or to other entities. In such embodiments, the utility or other external entity may send synchronization and monitoring signals for use by the various systems including the batteries, and the multiple CDI agents associated with the various batteries may interact to exchange information and maintain at least partial coordination between the operations of the batteries.
FIG. 31 is a block diagram illustrating one example of components of an embodiment of a system 3100 that performs automated control of DC power from multiple batteries in a coordinated manner, such as in a real-time manner and to optimize long-term operation of the batteries. In particular, multiple CDI agents 3050 a-n are illustrated, which are each controlling operation of one of n associated batteries (not shown). In the example of FIG. 31, each CDI agent receives not only battery state information for its associated battery, but also may optionally receive additional information that includes requests for power from the utility, status information related to the utility, and/or predicted price information for power that is supplied to an electrical grid (not shown) of the utility from the batteries. Each CDI agent further includes rules and an optimizer component in this example, with the rules specifying the goals and constraints to satisfy for the associated battery, and the optimizer component using those rules and other input information to make corresponding automated control decisions for the associated battery, which are output by the CDI agent as target information for the associated battery. Additional details are included herein related to specification and use of such rules and operations of such an optimizer component. FIG. 31 further illustrates a combiner component 3110 that combines information from the various CDI agents to determine an aggregate response to the utility's requests, corresponding to the aggregate power being provided by the batteries being controlled.
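The role of the combiner component can be illustrated with a minimal sketch that sums the per-battery targets reported by the agents and compares the total with what the utility requested. The function name `combine_targets` and the idea of reporting a satisfied fraction are hypothetical; the text does not specify how the combiner computes its aggregate response.

```python
def combine_targets(agent_targets_w, requested_total_w):
    """Aggregate per-battery target powers into one response to the utility.

    Returns the total power offered and the fraction of the request it covers
    (capped at 1.0). Both outputs are illustrative assumptions.
    """
    total_w = sum(agent_targets_w)
    if requested_total_w <= 0.0:
        return total_w, 1.0
    return total_w, min(1.0, total_w / requested_total_w)


if __name__ == "__main__":
    # Three agents: one battery is offering nothing, e.g. to protect its state.
    print(combine_targets([500.0, 300.0, 0.0], requested_total_w=1000.0))
```

A satisfied fraction of this kind could be compared against a defined percentage of requests to be met, in the spirit of the thresholds mentioned earlier in the description.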
It will also be appreciated that the described techniques may be used with a wide variety of other types of target systems, some of which are further discussed below, and that the invention is not limited to the techniques discussed for particular target systems and corresponding automated control systems. For illustrative purposes, some embodiments are described below in which specific types of operations are performed, including with respect to using the described techniques with particular types of target systems and to perform particular types of control activities that are determined in particular manners. These examples are provided for illustrative purposes and are simplified for the sake of brevity, and the inventive techniques may be used in a wide variety of other situations, including in other environments and with other types of automated control action determination techniques, some of which are discussed below.
More generally, a target system to be controlled or otherwise manipulated may have numerous elements that are inter-connected in various manners, with a subset of those elements being inputs or other control elements that a corresponding automated control system may modify or otherwise manipulate in order to affect the operation of the target system. In at least some embodiments and situations, a target system may further have one or more outputs that the manipulations of the control elements affect, such as if the target system is producing or modifying physical goods or otherwise producing physical effects. For example, output of a target system involving automated control of one or more batteries may include electrical power being provided by the batteries, and inputs or other control elements may include the actuator(s) and/or battery tracking controller(s) used to manipulate the power being provided from the one or more batteries.
As part of implementing such an automated control system for a particular target system, an embodiment of a Collaborative Distributed Decision (CDD) system may use the described techniques to perform various automated activities involved in constructing and implementing the automated control system, including one or more CDI agents (also referred to as a CDD decision module or system, or a portion of such module or system) for use in controlling particular target systems. Some aspects of such activities of an example CDD system are provided below, with additional details included in U.S. patent application Ser. No. 14/746,738, filed Jun. 22, 2015 and entitled “Cooperative Distributed Control Of Target Systems;” in U.S. Patent Application No. 62/182,968, filed Jun. 22, 2015 and entitled “Applications Of Cooperative Distributed Control Of Target Systems;” in U.S. Patent Application No. 62/182,796, filed Jun. 22, 2015 and entitled “Gauge Systems;” and in international PCT Patent Application No. PCT/US2015/037022, filed Jun. 22, 2015 and entitled “Cooperative Distributed Control Of Target Systems,” each of which is hereby incorporated by reference in its entirety.
In particular, the CDD system may in some embodiments implement a Decision Module Construction component that interacts with one or more users to obtain a description of a target system, including restrictions related to the various elements of the target system, and one or more goals to be achieved during control of the target system—the Decision Module Construction component then performs various automated actions to generate, test and deploy one or more executable decision modules (also referred to at times as “decision elements” and/or “agents”) to use in performing the control of the target system. When the one or more executable decision modules are deployed and executed, the CDD system may further provide various components within or external to the decision modules being executed to manage their control of the target system, such as a Control Action Determination component of each decision module to optimize or otherwise enhance the control actions that the decision module generates, and/or one or more Coordinated Control Management components to coordinate the control actions of multiple decision modules that are collectively performing the control of the target system.
As noted above, a Collaborative Distributed Decision (CDD) system may in some embodiments use at least some of the described techniques to perform various automated activities involved in constructing and implementing an automated control system for a specified target system, such as to modify or otherwise manipulate inputs or other control elements of the target system that affect its operation (e.g., affect one or more outputs of the target system). An automated control system for such a target system may in some situations have a distributed architecture that provides cooperative distributed control of the target system, such as with multiple decision modules that each control a portion of the target system and that operate in a partially decoupled manner with respect to each other. If so, the various decision modules' operations for the automated control system may be at least partially synchronized, such as by each reaching a consensus with one or more other decision modules at one or more times, even if a fully synchronized convergence of all decision modules at all times is not guaranteed or achieved.
The CDD system may in some embodiments implement a Decision Module Construction component that interacts with one or more users to obtain a description of a target system, including restrictions related to the various elements of the target system, and one or more goals to be achieved during control of the target system; the Decision Module Construction component then performs various automated actions to generate, test and deploy one or more executable decision modules to use in performing the control of the target system. The Decision Module Construction component may thus operate as part of a configuration or setup phase that occurs before a later run-time phase in which the generated decision modules are executed to perform control of the target system, although in some embodiments and situations the Decision Module Construction component may be further used after an initial deployment to improve or extend or otherwise modify an automated control system that has one or more decision modules (e.g., while the automated control system continues to be used to control the target system), such as to add, remove or modify decision modules for the automated control system.
In some embodiments, some or all automated control systems that are generated and deployed may further provide various components within them for execution during the runtime operation of the automated control system, such as by including such components within decision modules in some embodiments and situations. Such components may include, for example, a Control Action Determination component of each decision module (or of some decision modules) to optimize or otherwise determine and improve the control actions that the decision module generates. For example, such a Control Action Determination component in a decision module may in some embodiments attempt to automatically determine the decision module's control actions for a particular time to reflect a near-optimal solution with respect to one or more goals and in light of a model of the decision module for the target system that has multiple inter-related constraints; if so, such a near-optimal solution may be based at least in part on a partially optimized solution that is within a threshold amount of a fully optimized solution. Such determination of one or more control actions to perform may occur for a particular time and for each of one or more decision modules, as well as be repeated over multiple times for ongoing control by at least some decision modules in some situations. In some embodiments, the model for a decision module is implemented as a Hamiltonian function that reflects a set of coupled differential equations based in part on constraints representing at least part of the target system, such as to allow the model and its Hamiltonian function implementation to be updated over multiple time periods by adding additional expressions within the evolving Hamiltonian function.
In some embodiments, the components included within a generated and deployed automated control system for execution during the automated control system's runtime operation may further include one or more Coordinated Control Management components to coordinate the control actions of multiple decision modules that are collectively performing the control of a target system for the automated control system. For example, some or all decision modules may each include such a Coordinated Control Management component in some embodiments to attempt to synchronize that decision module's local solutions and proposed control actions with those of one or more other decision modules in the automated control system, such as by determining a consensus shared model with those other decision modules that simultaneously provides solutions from the decision module's local model and the models of the one or more other decision modules. Such inter-module synchronizations may occur repeatedly to determine one or more control actions for each decision module at a particular time, as well as to be repeated over multiple times for ongoing control. In addition, each decision module's model is implemented in some embodiments as a Hamiltonian function that reflects a set of coupled differential equations based in part on constraints representing at least part of the target system, such as to allow each decision module's model and its Hamiltonian function implementation to be combined with the models of one or more other decision modules by adding additional expressions for those other decision modules' models within the initial Hamiltonian function for the local model of the decision module.
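As one non-exclusive illustration of the repeated inter-module synchronization described above, the following Python sketch has each decision module repeatedly average its proposed value with those of its peers until all proposals agree within a tolerance. Plain iterative averaging is used here only as a simple stand-in for the consensus shared model, and is not the Hamiltonian-based combination described above; the names `consensus_round`, `synchronize`, `proposals` and `peers`, and the tolerance and round limit, are hypothetical.

```python
from typing import Dict, List, Tuple


def consensus_round(proposals: Dict[str, float],
                    peers: Dict[str, List[str]]) -> Dict[str, float]:
    """One synchronization round: each module averages its own proposal
    with the proposals of the peers it interacts with."""
    updated = {}
    for name, value in proposals.items():
        group = [value] + [proposals[p] for p in peers[name]]
        updated[name] = sum(group) / len(group)
    return updated


def synchronize(proposals: Dict[str, float], peers: Dict[str, List[str]],
                tolerance: float = 1e-3,
                max_rounds: int = 100) -> Tuple[Dict[str, float], int]:
    """Repeat rounds until all proposals agree within `tolerance`,
    or until `max_rounds` is reached (consensus is not guaranteed)."""
    current = dict(proposals)
    rounds = 0
    while (max(current.values()) - min(current.values()) > tolerance
           and rounds < max_rounds):
        current = consensus_round(current, peers)
        rounds += 1
    return current, rounds


# Three hypothetical decision modules; A and C interact only through B.
proposals = {"A": 1.0, "B": 2.0, "C": 6.0}
peers = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
result, rounds = synchronize(proposals, peers)
```

The round limit reflects that a fully synchronized convergence of all decision modules may not be achieved in every situation, in which case the partially synchronized proposals are returned as they stand.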
As noted above, the described techniques may be used to provide automated control systems for various types of physical systems or other target systems. In one or more embodiments, an automated control system is generated and provided and used to control a micro-grid electricity facility, such as at a residential location that includes one or more electricity sources (e.g., one or more solar panel grids, one or more wind turbines, etc.) and one or more electricity storage and source mechanisms (e.g., one or more batteries). The automated control system may, for example, operate at the micro-grid electricity facility (e.g., as part of a home automation system), such as to receive requests from the operator of a local electrical grid to provide particular amounts of electricity at particular times, and to control operation of the micro-grid electricity facility by determining whether to accept each such request. If a request is accepted, the control actions may further include selecting which electricity source (e.g., solar panel, battery, etc.) to use to provide the requested electricity, and otherwise the control actions may further include determining to provide electricity being generated to at least one energy storage mechanism (e.g., to charge a battery). Outputs of such a physical system include the electricity being provided to the local electrical grid, and a goal that the automated control system implements may be, for example, to maximize profits for the micro-grid electricity facility from providing the electricity.
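As one non-exclusive illustration of the request handling described above for a micro-grid electricity facility, the following Python sketch decides whether to accept a request, selects a source if so, and otherwise directs generated electricity to the battery. The rules, the price threshold, and the names `FacilityState` and `decide` are hypothetical simplifications, and the sketch does not reflect the optimization performed by a decision module in the described embodiments.

```python
from dataclasses import dataclass


@dataclass
class FacilityState:
    solar_kw: float        # power the solar panels are generating now
    battery_kwh: float     # energy currently stored in the battery
    battery_max_kw: float  # maximum battery discharge rate


def decide(request_kw: float, hours: float, price: float,
           min_price: float, state: FacilityState) -> dict:
    """Return a control action for one utility request (illustrative rules)."""
    # Decline unprofitable requests and store the generated power instead.
    if price < min_price:
        return {"accept": False, "source": None, "charge_kw": state.solar_kw}
    # Prefer solar output when it covers the whole request.
    if state.solar_kw >= request_kw:
        return {"accept": True, "source": "solar",
                "charge_kw": state.solar_kw - request_kw}
    # Otherwise cover the shortfall from the battery if rate and energy allow.
    shortfall_kw = request_kw - state.solar_kw
    if (shortfall_kw <= state.battery_max_kw
            and shortfall_kw * hours <= state.battery_kwh):
        source = "battery" if state.solar_kw == 0 else "solar+battery"
        return {"accept": True, "source": source, "charge_kw": 0.0}
    # The request cannot be met, so decline it and charge the battery.
    return {"accept": False, "source": None, "charge_kw": state.solar_kw}


state = FacilityState(solar_kw=2.0, battery_kwh=6.0, battery_max_kw=3.0)
accepted = decide(4.0, 1.0, price=0.30, min_price=0.10, state=state)
declined = decide(4.0, 1.0, price=0.05, min_price=0.10, state=state)
```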
In one or more embodiments, an automated control system is generated and provided and used to control a vehicle with a battery, a motor and in some cases an engine, such as an electrical bicycle in which power may come from a user who is pedaling and/or from a motor powered by the battery and/or the engine. The automated control system may, for example, operate on the vehicle or on the user, such as to control operation of the vehicle by determining whether at a current time to remove energy from the battery to power the motor (and if so to further determine how much energy to remove from the battery) or to instead add excess energy to the battery (e.g., as generated by the engine, and if so to further determine how much energy to generate from the engine; and/or as captured from braking or downhill coasting). Outputs of such a physical system include the effects of the motor to move the vehicle, and a goal that the automated control system implements may be, for example, to move the vehicle at one or more specified speeds with a minimum of energy produced from the battery, and/or to minimize use of fuel by the engine.
It will be appreciated that batteries may be used in a wide variety of other situations and may similarly be controlled by embodiments of the described techniques, such as with solar panels and other photovoltaic systems, electrical cars and other vehicles, etc.
Use of the described techniques may also provide various types of benefits in particular embodiments, including non-exclusive examples of beneficial attributes or operations as follows:

- infer interests/desired content in a cold start environment where textual (or other unstructured) data is available and with minimal user history;
- improve inference in a continuous way that can incorporate increasingly rich user histories;
- improve inference performance with the addition of feedback, explicit/implicit, positive/negative and preferably in a real-time or near-real-time manner;
- derive information from domain experts that provide business value, and embed them in inference framework;
- dynamically add new unstructured data that may represent new states, and update existing model in a calibrated way;
- renormalize inference system to accommodate conflicts;
- immediately do inferencing in a new environment based on a natural language model;
- add new information as a statistical model, and integrate with a natural language model to significantly improve inference/prediction;
- integrate new data and disintegrate old data in a way that only improves performance;
- perform inferencing in a data secure way;
- integrate distinct inferencing elements in a distributed network and improve overall performance;
- easily program rules and information into the system from a lay-user perspective;
- inexpensively perform computer inferences in a way that is suitable for bandwidth of mobile devices; and
- incorporate constraint information.
It will be appreciated that some embodiments may not include all such illustrative benefits, and that some embodiments may include some benefits that are not listed.

FIG. 41 is a block diagram illustrating an example environment in which a system 4100 for performing cooperative distributed control of target systems may be implemented, such as to control physical target systems having one or more batteries by using characteristics of each battery's state to perform automated control of DC power that is provided from the battery (e.g., in a real-time manner and to optimize long-term operation of the battery), such as in the manner discussed with respect to FIGS. 30-40 and 42-51 and elsewhere herein. In particular, the system 4100 is analogous in some respects to system 3100 of FIG. 31, such as if the CDD decision modules 4124 of FIG. 41 (or their CDD Control Action Determination components 4144 and/or CDD Coordinated Control Management components 4146) correspond to the CDI agents 3050 of FIG. 31, and if the target system of FIG. 41 includes batteries to be controlled (e.g., with respect to control elements 4161), but with additional details in FIG. 41 regarding operation of the CDD decision modules.
In particular, in the example environment of FIG. 41, target system 1 4160 is illustrated, with the automated control system 4122 being deployed and implemented to use in actively controlling the target system 1 4160. It will be appreciated that such a target system may include a variety of mechanical, electronic, chemical, biological, and/or other types of components to implement operations of the target system in a manner specific to the target system. In the example of FIG. 41, the decision modules 4124 are represented as individual decision modules 4124 a, 4124 b, etc., to 4124 n, and may be executing locally to the target system 1 4160 and/or in a remote manner over one or more intervening computer networks (not shown). In the illustrated example, each of the decision modules 4124 includes a local copy of a CDD Control Action Determination component 4144, such as with component 4144 a supporting its local decision module 4124 a, component 4144 b supporting its local decision module 4124 b, and component 4144 n supporting its local decision module 4124 n. Similarly, the actions of the various decision modules 4124 are coordinated and synchronized in a peer-to-peer manner in the illustrated embodiment, with each of the decision modules 4124 including a copy of a CDD Coordinated Control Management component 4146 to perform such synchronization, with component 4146 a supporting its local decision module 4124 a, component 4146 b supporting its local decision module 4124 b, and component 4146 n supporting its local decision module 4124 n.
As the decision modules 4124 and automated control system 4122 execute, various interactions 4175 between the decision modules 4124 are performed, such as to share information about current models and other state of the decision modules to enable cooperation and coordination between various decision modules, such as for a particular decision module to operate in a partially synchronized consensus manner with respect to one or more other decision modules (and in some situations in a fully synchronized manner in which the consensus actions of all of the decision modules 4124 converge). During operation of the decision modules 4124 and automated control system 4122, various state information 4143 may be obtained by the automated control system 4122 from the target system 4160, such as initial state information and changing state information over time (e.g., with respect to batteries, not shown, corresponding to control elements 4161 of target system 1), and including outputs or other results in the target system 1 from control actions performed by the decision modules 4124.
The target system 1 in this example includes various control elements 4161 (e.g., batteries and their power output) that the automated control system 4122 may manipulate, and in this example each decision module 4124 may have a separate group of one or more control elements 4161 that it manipulates (such that decision module A 4124 a performs interactions 4169 a to perform control actions A 4147 a on control elements A 4161 a, decision module B 4124 b performs interactions 4169 b to perform control actions B 4147 b on control elements B 4161 b, and decision module N 4124 n performs interactions 4169 n to perform control actions N 4147 n on control elements N 4161 n). Such control actions affect the internal state 4163 of other elements of the target system 1, including optionally to cause or influence one or more outputs 4162 (e.g., aggregate electrical power being produced from the multiple batteries). As discussed in greater detail elsewhere herein, control element 4161 a may, for example, be a FET actuator connected to a particular first battery (not shown) of target system 1 that is being controlled, and other control elements 4161 b-n may similarly be other FET actuators connected to other batteries (not shown) of target system 1. As operation of the target system 1 is ongoing, at least some of the internal state information 4163 is provided to some or all of the decision modules to influence their ongoing control actions, with each of the decision modules 4124 a-4124 n possibly having a distinct set of state information 4143 a-4143 n, respectively, in this example.
As discussed in greater detail elsewhere, each decision module 4124 may use such state information 4143 and a local model 4145 of the decision module for the target system to determine particular control actions 4147 to next perform, such as for each of multiple time periods, although in other embodiments and situations, a particular automated control system may perform interactions with a particular target system for only one time period or only for some time periods. For example, the local CDD Control Action Determination component 4144 for a decision module 4124 may determine a near-optimal local solution for that decision module's local model 4145, and with the local CDD Coordinated Control Management component 4146 determining a synchronized consensus solution to reflect other of the decision modules 4124, including to update the decision module's local model 4145 based on such local and/or synchronized solutions that are determined. Thus, during execution of the automated control system 4122, the automated control system performs various interactions with the target system 4160, including to request state information, and to provide instructions to modify values of or otherwise manipulate control elements 4161 of the target system 4160. For example, for each of multiple time periods, decision module 4124 a may perform one or more interactions 4169 a with one or more control elements 4161 a of the target system, while decision module 4124 b may similarly perform one or more interactions 4169 b with one or more separate control elements B 4161 b, and decision module 4124 n may perform one or more interactions 4169 n with one or more control elements N 4161 n of the target system 4160. In other embodiments and situations, at least some control elements may not perform control actions during each time period.
In other embodiments and situations (e.g., if only a single battery is being controlled), the deployed copy of the automated control system may include only a single executing decision module, such as to include a local CDD Control Action Determination component but to not include any local CDD Coordinated Control Management component (since there are no other decision modules with which to synchronize and interact).
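As one non-exclusive illustration of the single-decision-module case, the following Python sketch shows a control loop in which, for each of multiple time periods, state information is obtained from the target system, a control action is determined from a local rule, and the action is applied to the control element. The classes `ToyBattery` and `DecisionModule`, the reserve and power limits, and the clipping rule are hypothetical stand-ins for the local model and the Control Action Determination component.

```python
from typing import Dict, List


class ToyBattery:
    """Hypothetical target system with one control element (power output)."""

    def __init__(self, energy_kwh: float) -> None:
        self.energy_kwh = energy_kwh

    def read_state(self) -> Dict[str, float]:
        return {"energy_kwh": self.energy_kwh}

    def apply(self, power_kw: float, hours: float) -> None:
        # Discharging at power_kw for the period removes energy.
        self.energy_kwh -= power_kw * hours


class DecisionModule:
    """Hypothetical decision module using a simple local rule as its model."""

    def __init__(self, max_power_kw: float, reserve_kwh: float) -> None:
        self.max_power_kw = max_power_kw
        self.reserve_kwh = reserve_kwh

    def determine_action(self, state: Dict[str, float],
                         demand_kw: float, hours: float) -> float:
        # Never discharge below the reserve or above the rated power.
        available_kw = max(0.0, (state["energy_kwh"] - self.reserve_kwh) / hours)
        return min(demand_kw, self.max_power_kw, available_kw)


def run(module: DecisionModule, target: ToyBattery,
        demands_kw: List[float], hours: float = 1.0) -> List[float]:
    """One iteration per time period: obtain state, decide, then act."""
    actions = []
    for demand_kw in demands_kw:
        state = target.read_state()
        action = module.determine_action(state, demand_kw, hours)
        target.apply(action, hours)
        actions.append(action)
    return actions


battery = ToyBattery(energy_kwh=5.0)
module = DecisionModule(max_power_kw=2.0, reserve_kwh=1.0)
actions = run(module, battery, demands_kw=[1.0, 3.0, 3.0, 3.0])
```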
While not illustrated in FIG. 41, the distributed nature of operations of automated control systems such as those of 4122 allows partially decoupled operations of the various decision modules, including to allow the group of decision modules 4124 to be modified over time while the automated control system 4122 is in use, such as to add new decision modules 4124 and/or to remove existing decision modules 4124 (e.g., to reflect changes to underlying batteries in use, such as in different home power systems). In a similar manner, changes may be made to particular decision modules 4124 and/or 4128, such as to change rules or other restrictions specific to a particular decision module and/or to change goals specific to a particular decision module over time, with a new corresponding model being generated and deployed within such a decision module, including in some embodiments and situations while the corresponding automated control system continues control operations of a corresponding target system. In addition, while each automated control system is described as controlling a single target system in the examples of FIG. 41, other configurations may be used in other embodiments and situations, such as for a single automated control system to control multiple target systems (e.g., multiple inter-related target systems, multiple target systems of the same type, etc.), and/or multiple automated control systems may operate to control a single target system, such as by each operating independently to control different portions of that target system. It will be appreciated that other configurations may similarly be used in other embodiments and situations.
FIG. 32 illustrates an embodiment of controlling a battery as part of a larger target system, which in this example is a system 3200 involving a home power system that includes a solar panel; thus, the example embodiments discussed previously with respect to FIGS. 30 and/or 31 may in some situations be used as part of a larger system such as the example system 3200. In particular, the block diagram of FIG. 32 illustrates example components of an embodiment of a system 3200 for performing automated control of DC power from a battery that is part of a home electrical power system with solar power being generated, such as in a real-time manner and/or to optimize long-term operation of the battery, and with the home power generation and use being monitored and synchronized by an external entity, such as an entity providing or managing one or more CDI agents to control the battery of the system 3200. In the example of FIG. 32, the example home's power system is also connected to an external electrical grid from which it receives power and provides power at various times, with the battery serving to store electrical power generated by the solar power system and to supply power to the house and/or to the electrical grid as appropriate.
In the illustrated example of FIG. 32, components similar to those of FIG. 30 continue to be illustrated, including a battery 3010, a sensor module 3020, an actuator 3030 for the battery, an on-site battery tracking controller 3040, etc. In the example of FIG. 32, however, the CDI agent 3050 of FIG. 30 is not illustrated as part of the components present at the physical location of the example house, such as if the CDI agent in use with respect to FIG. 32 instead executes in a remote location (e.g., in the cloud or other computer network location) and provides tracking and/or synchronization signals to the battery tracking controller 3040 of FIG. 32 in a manner similar to that illustrated with respect to FIG. 30. Such tracking and/or synchronization signals may, for example, include desired power output of the battery and/or desired battery parameters (e.g., internal temperature, voltage, current, etc.) for a current time or immediately subsequent time. In addition, as discussed in greater detail elsewhere herein, the CDI agent(s) may generate such tracking and/or synchronization signals based on monitored information about the battery 3010 (and any other batteries being controlled), power requests from the utility managing the external electrical grid, defined constraints or other rules to be used, etc.
In addition, a number of additional components are illustrated in FIG. 32, including an inverter/rectifier module 3210 that receives output power from the battery and/or supplies electrical power to the battery for storage, a solar panel 3220 that generates electrical power and that has its own associated sensor and inverter, a distribution box 3230 that receives and/or supplies power to an external electrical grid and that controls power distribution to a load 3240 for the house, etc. In addition, two local control agents 3260 and 3270 are illustrated to assist in controlling operation of the battery tracking controller 3040 of FIG. 32, with Agent1 3260 interacting directly with the battery tracking controller, and Agent2 3270 performing activities to synchronize the AC phase of the power for the battery with that of the house power system and/or grid, such as to provide resonance for the power being received and/or provided. The battery tracking controller 3040 and agents 3260 and 3270 (other than the utility sensor processor) are together referred to as a ‘control processor’ in this example, with the battery tracking controller providing system status updates, and with communications between the agents being managed to support such a multi-agent architecture. The tomography of Agent2 tracks dynamic changes in the battery state using a non-destructive x-ray. In addition, an external entity 3280 (e.g., the utility providing or managing the external electrical grid) is providing monitoring and synchronization signals in this example to the battery tracking controller 3040, such as to coordinate the power being used and/or provided via numerous such home power systems and/or other customers.
While the example of FIG. 32 involves use of the battery 3010 in a solar panel system, it will be appreciated that batteries may be charged and/or discharged in a variety of types of environments and systems, and similar activities of a corresponding CDI agent may be used to control such activities in the manner described herein.
FIG. 35 is a block diagram 3500 illustrating example visual displays of performance of an embodiment of a system that is using characteristics of a battery's state to perform automated control of DC power from the battery, with the home power generation and use being monitored and synchronized by an external entity, such as in the example of FIG. 32. In particular, in the example of FIG. 35, chart 3520 illustrates power requests that are received over time from a utility for one or more home power systems under the automated control of CDI agent(s), chart 3530 illustrates power that is supplied by the home power system(s) in response to the requests from the utility based on the automated control of the CDI agent(s), and chart 3510 combines the charts together. As is illustrated, the automated control of the CDI agent(s) provides highly accurate responses to the utility requests over time, while also optimizing performance of the one or more batteries in the home power system(s). In addition, chart 3540 further illustrates the incremental power supplied by the one or more batteries in the home power system(s) based on the automated control of the CDI agent(s).
FIG. 33 illustrates a further example of a system 3300 similar to that of system 3200 of FIG. 32, but in which the system 3300 is operating in a restricted emergency mode due to loss of monitoring/synchronization signals from the utility or other external entity that provides power requests. In particular, the block diagram of system 3300 illustrates a number of elements similar to that of system 3200 of FIG. 32, including a battery 3010 that is being controlled, a solar panel 3220, a distribution box 3230 and a house load 3240. However, in the example of FIG. 33, the signals from entity 3280 have been disrupted, and in response the control of the battery 3010 has been switched from the normal battery tracking controller 3040 of FIG. 32 to an alternative emergency battery tracking controller 3340 in FIG. 33. In addition, the components illustrated in FIG. 32 with respect to the Agent2 3270 are not illustrated in FIG. 33, as the system does not provide the phase synchronization and resonance functionality of FIG. 32 without the monitoring/synchronization signals that are missing in the situation of FIG. 33. In this example, the emergency battery tracking controller 3340 operates in a temporary fashion to maintain operation of the battery 3010 in the absence of the monitoring/synchronization signals, such as to receive information about the battery state and to issue corresponding instructions for the battery operation, such as based on battery state but without attempting to respond to power requests from the utility. Additional details are described below with respect to operation of the emergency controller in at least some embodiments.
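As one non-exclusive illustration of switching between the normal and emergency battery tracking controllers, the following Python sketch selects a mode based on how long it has been since a monitoring/synchronization signal was last received, and in the emergency mode sets the battery power from battery state alone without attempting to respond to power requests. The timeout, the temperature range, the fallback power level, and the function names are hypothetical and are not taken from the described embodiments, and the normal mode is simplified here to follow the request directly.

```python
from typing import Tuple


def select_controller(seconds_since_signal: float,
                      timeout_s: float = 30.0) -> str:
    """Use the emergency controller once signals have been lost for too long."""
    return "emergency" if seconds_since_signal > timeout_s else "normal"


def target_power_kw(mode: str, requested_kw: float, battery_temp_c: float,
                    safe_kw: float = 1.0,
                    temp_range_c: Tuple[float, float] = (10.0, 40.0)) -> float:
    """Return the power to request from the battery for the current period."""
    if mode == "normal":
        # Simplification: follow the utility request directly.
        return requested_kw
    # Emergency mode: ignore requests and rely only on battery state.
    low, high = temp_range_c
    if low <= battery_temp_c <= high:
        return safe_kw
    return 0.0


normal_mode = select_controller(seconds_since_signal=5.0)
emergency_mode = select_controller(seconds_since_signal=120.0)
normal_power = target_power_kw(normal_mode, requested_kw=2.5, battery_temp_c=25.0)
emergency_power = target_power_kw(emergency_mode, requested_kw=2.5, battery_temp_c=25.0)
```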
FIG. 36 is a block diagram 3600 similar to the block diagram 3500 of FIG. 35, with respect to illustrating example visual displays of performance of an embodiment of a system that is using characteristics of a battery's state to perform automated control of DC power from the battery, but in which typical monitoring and synchronization of the home power generation and use is temporarily interrupted, such as in the example of FIG. 33. In particular, in the example of FIG. 36, chart 3620 illustrates power requests that are made over time by a utility for one or more home power systems under the automated control of CDI agent(s) but are not received by the home power systems due to the temporary interruption, and chart 3630 illustrates power that is supplied by the home power system(s) over time. Chart 3610 combines information together about the power supplied and the home power system's information about the utility's power requests, but with those power requests set to 0 during the time of temporary interruption since they are not received. Chart 3640 further illustrates the incremental power supplied by the one or more batteries in the home power system(s) based on the automated control of the CDI agent(s)—as is shown, despite the lack of utility power requests, the CDI agent(s) still control the power output of the one or more batteries based on other factors, such as the internal state of the one or more batteries, in order to continue to optimize performance of the one or more batteries in the home power system(s) to the extent possible based on the partial information that is available.
FIG. 34 is a block diagram illustrating example components of a model 3400 of a battery's state and operation, to enable automated control of DC power from the battery, such as in a real-time manner and to optimize long-term operation of the battery. In particular, the model 3400 of FIG. 34 includes representations of a charge path 3410 for the battery, a discharge path 3420 for the battery, a thermal model 3440 for the battery, a repository 3430, as well as a load 3450 that may be placed on the battery and supported by a Nernst voltage 3480 produced by the battery. As discussed in greater detail elsewhere herein, in at least some embodiments and situations, the control of the battery may include preventing or minimizing the operation of the battery at a saturation level corresponding to the saturation model 3460, such as with the battery operating outside an optimal temperature range or other set of operational characteristics that decrease the life of the battery and/or impose other detrimental effects on the battery. By instead controlling the battery to operate in a linear range at all times or as much as is possible, various benefits are obtained, including increased battery life and/or other improved operational characteristics, as discussed in greater detail elsewhere herein.
FIG. 37 is a block diagram illustrating an embodiment of an example system 3700 for using characteristics of a battery's state to perform automated control of DC power from the battery, such as in a real-time manner and to optimize long-term operation of the battery. In particular, in a manner similar to that of system 3000 of FIG. 30, a CDI agent 3750 operates to control operation of a battery system 3710, with details of the battery system (e.g., a battery, sensor(s), actuators, battery tracking controller, etc.) not illustrated. Additional information 3740 is also illustrated to show information used and exchanged as part of controlling the battery system, including to maintain information about an internal state of the battery, and monitoring components to send and listen to various information being exchanged (e.g., as part of an information publish/subscribe or other push system) and used to update the internal state information.
FIG. 38 is a block diagram illustrating example components of an embodiment of a system 3800 for using characteristics of a battery's state to perform automated control of DC power from the battery, such as in a real-time manner and to optimize long-term operation of the battery. In particular, the illustrated system shows data flow into and between various components, including a state estimator component 3855 that receives information for a battery including various battery parameters (e.g., Nernst potential, incremental value to saturation, capacity, etc.), other measurements (e.g., voltage, current, temperature, etc.) and incremental control applied to the battery (e.g., a desired power to provide, either as an absolute value or an incremental change), and estimates a corresponding state of the battery. The state estimator component also includes a parameter adaptation engine (PAE) 3865 in this example that adapts the incoming battery parameter information in one or more defined manners. The state estimator component provides output about the estimated state to the LQ tracker component 3845, which operates as a battery tracking controller to receive the estimated battery state information and other tracking parameters, to compute one or more optimal or approximately optimal control actions for the battery, and to output those control actions to be applied to the battery.
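As one non-exclusive illustration of how a linear-quadratic (LQ) tracker may compute control actions from an estimated state, the following Python sketch assumes a scalar model in which the tracked quantity changes by `b` times the applied incremental control (x[k+1] = x[k] + b*u[k]), with quadratic weights `q` on tracking error and `r` on control effort. The model, the weights, and the function names `lq_gain` and `track` are assumptions made for illustration and do not describe the LQ tracker component 3845 itself.

```python
from typing import List


def lq_gain(a: float, b: float, q: float, r: float,
            iterations: int = 200) -> float:
    """Iterate the scalar discrete-time Riccati recursion for the model
    x[k+1] = a*x[k] + b*u[k] and return the resulting feedback gain."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def track(initial: float, reference: float, gain: float,
          steps: int, b: float = 1.0) -> List[float]:
    """Apply u = -gain * (x - reference) each step to the assumed
    integrator-like model x[k+1] = x[k] + b*u[k]; return the trajectory."""
    x = initial
    history = [x]
    for _ in range(steps):
        u = -gain * (x - reference)  # incremental control action
        x = x + b * u
        history.append(x)
    return history


gain = lq_gain(a=1.0, b=1.0, q=1.0, r=1.0)
trajectory = track(initial=0.0, reference=2.0, gain=gain, steps=20)
```

With these example weights the gain settles near 0.618, so each step removes roughly 62% of the remaining tracking error; a larger `r` would yield smaller and more gradual control actions.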
FIG. 39 is a block diagram illustrating example components of an embodiment of a system 3900 for using characteristics of a battery's state to perform automated control of DC power from the battery, such as in a real-time manner and to optimize long-term operation of the battery. In particular, FIG. 39 continues the example of FIG. 38, and provides further details regarding an example embodiment of the state estimator component 3855 of FIG. 38, which in the example of FIG. 39 is a discrete Kalman filter 3955. In this example, the filter 3955 takes as input information about the previously estimated battery parameters, initial values for the parameters, and information about incremental control actions taken for the battery, and uses the information to generate a new estimate of the battery parameters, so as to dynamically update the battery parameter estimates to capture ongoing changes in the battery.
FIG. 40 is a block diagram illustrating example components of an embodiment of a system 4000 that performs automated control of DC power from multiple batteries in a coordinated manner, such as in a real-time manner and to optimize long-term operation of the batteries. In particular, the system 4000 of FIG. 40 has some similarities to that of FIG. 32, but illustrates an example architecture of a system to support coordinated control of large numbers of batteries and associated systems (e.g., over one million such batteries and associated systems in this example, such as to correspond to one or more regions, states, countries, etc.). In particular, in the illustrated example, various batteries and associated systems 4005 (e.g., home power systems with solar panels) having on-site battery tracking controllers are illustrated, along with one or more utilities 4085 that provide power requests for the batteries and associated systems 4005, and one or more entities 4090 serving as system administration to manage a distributed control system 4075 for the batteries and associated systems 4005.
In this example, the distributed control system 4075 is implemented in a centralized manner in a network-accessible location, such as via an online computing environment (e.g., Microsoft Azure), although it may be implemented in other manners in other embodiments. The distributed control system 4075 includes one or more components to interface with and interact with the utilities 4085, one or more components to interface with and interact with the batteries and associated systems 4005, and one or more monitoring and/or configuration components with which the system administration entities 4090 may interact to monitor and/or control the distributed control system 4075. In addition, the various CDI agents that support the batteries and associated systems 4005 (e.g., with one CDI agent per battery pack and associated system) are executed in the network-accessible location and are clustered together, with various inter-cluster communication mechanisms used (e.g., a publish/subscribe system with various topics, a communication service bus between at least some CDI agents and/or clusters, etc.). The clusters may be formed in various manners in various embodiments, such as to group CDI agents based on having associated batteries and systems that share one or more characteristics, such as geographical location (e.g., being part of the system electrical grid substation area) and/or operating characteristics. In addition, the clusters may be used to coordinate the CDI agents in stages and/or tiers, such as to first coordinate the CDI agents within a cluster, then coordinate between two or more clusters, etc., and optionally with multiple tiers of clusters (e.g., structured in a hierarchical manner). 
Various additional components may be provided and used as part of the distributed control system 4075, such as a site management system to manage changes in CDI agents and/or batteries and associated systems (e.g., to add new CDI agents for new battery systems and/or to remove existing CDI agents for existing battery systems being removed from system 4075 management; to add new CDI agents to particular clusters and/or to remove existing CDI agents from clusters; to create, remove and modify clusters; etc.), storage services available from the network-accessible location to store state information and other information being used, resource management services available from the network-accessible location to manage computing resources provided by the network-accessible location, etc.
FIG. 42 is a block diagram illustrating example components of an embodiment of a system 4200 that performs automated control of DC power from multiple batteries in a coordinated manner, such as in a real-time manner and to optimize long-term operation of the batteries. In particular, FIG. 42 illustrates that a micro-grid network or other network of batteries and associated systems may be controlled and managed in at least some embodiments via a subset 4235 of the batteries and associated systems, such as if other of the batteries and associated systems are not part of a distributed control system. In such situations, a virtual network model may be created to model and estimate information about the micro-grid network or other network as a whole, including to estimate information about other battery and associated systems (referred to as a DER, or distributed energy resource, in this example) that are not being controlled.
FIG. 43 continues the example of FIG. 42, and includes additional information about the DER components being controlled in a coordinated manner. In particular, FIG. 43 is a block diagram illustrating example components of an embodiment of a system 4300 that performs automated control of DC power from multiple batteries in a coordinated manner, such as in a real-time manner and to optimize long-term operation of the batteries. The system 4300 includes a representation of a networked system 4345 analogous to distributed control system 4075 of FIG. 40 with multiple coordinated CDI agents, with the networked system 4345 also referred to as SERA (Smart Energy Reference Architecture) in this example. In this example, the networked system 4345 receives monitoring information from various DER components, and outputs synchronization signals to battery tracking controller components of the DER systems to control their operation.
FIG. 44 continues the examples of FIGS. 42 and 43, and includes additional information about the networked system 4345 of FIG. 43, such as to include information about a virtual network as noted with respect to FIG. 42, in order to simulate or otherwise estimate information about operation of the overall system of batteries and associated CDI agents. In particular, FIG. 44 is a block diagram illustrating example components of an embodiment of a system 4400 that performs automated control of DC power from multiple batteries in a coordinated manner, such as in a real-time manner and to optimize long-term operation of the batteries. The system 4400 includes models or other visual representations of various elements that include a virtual generation plant, virtual load, virtual substation(s), and virtual DER(s), with one or more CDI agents associated with each. The various elements exchange information as shown, including to estimate or otherwise model operation of the overall system as a whole, even in situations in which only a subset of the DER components are being controlled. The various CDI agents further use a mean field representation to coordinate their actions, as discussed in greater detail elsewhere herein.
FIG. 45 illustrates a system 4500 similar to system 3000 of FIG. 30, in which various components interact to control operations of the battery according to defined criteria, but with additional elements 4555 illustrated in FIG. 45. In particular, FIG. 45 is a block diagram illustrating example components of an embodiment of a system 4500 for using characteristics of a battery's state to perform automated control of DC power from the battery, such as in a real-time manner and to optimize long-term operation of the battery, as well as to use price forecasts and additional information to enhance financial performance of the system. The additional elements 4555 of FIG. 45 allow the system 4500 to obtain and use information about forecasted prices for power supplied to the utility, such as based on weather data, past and current price data, etc., such as to further manage the control of the battery to optimize or otherwise enhance one or more financial constraints, such as in combination with other constraints related to battery life and/or other performance characteristics.
Additional details regarding an example embodiment of a typical battery tracking controller that uses monitoring/synchronization signals from an external entity (also referred to as a ‘hybrid tracker’) are as follows.
One example embodiment of a tracking control system for a generic lithium ion high power battery cell may be modeled by the electric circuit representation in FIG. 34. The model 3400 includes current and voltage sources to represent the chemical reactions that characterize the cell in charging and discharging operations. The control actions are mediated by an actuator driven by a tracking controller. The controller tracks a desired power signal that is generated by an inference module determining the desired response of the cell as a function of the current electric storage level and power demand. The inverter or rectifier circuits are not modeled in this example, and an idealized actuator is used for this model.
The dynamic behavior of the circuit in FIG. 34 is given by a differential equation as follows:
$\begin{matrix} \dot{x} (t) = G (x (t), u (t), \text{parameters}) & (1) \end{matrix}$
where the state
$x (t) = [\begin{matrix} power (t) \\ voltage (t) \\ current (t) \\ temperature (t) \end{matrix}] \in R^{4},$
and the control $u(t) \in R$, and the time line is represented by $t \in R$.
The function G(x₁(t),x₂(t),x₃(t),x₄(t),u(t)) is given by (2)
$\begin{matrix} [\begin{matrix} G_{1} (x_{1} (t), x_{2} (t), x_{3} (t), x_{4} (t), u (t)) \\ G_{2} (x_{1} (t), x_{2} (t), x_{3} (t), x_{4} (t), u (t)) \\ G_{3} (x_{1} (t), x_{2} (t), x_{3} (t), x_{4} (t), u (t)) \\ G_{4} (x_{1} (t), x_{2} (t), x_{3} (t), x_{4} (t), u (t)) \end{matrix}] = [\begin{matrix} α_{1} x_{1} (t) x_{2} (t) + α_{2} x_{1} (t) u (t) + β_{1} x_{1} (t) + λ_{1} x_{4} (t - τ) \\ β_{2} x_{2} (t) - β_{3} x_{1} (t) + φ_{1} u (t) + λ_{2} x_{4} (t - τ) \\ β_{4} x_{3} (t) + β_{5} x_{2} (t) + β_{6} x_{1} (t) + λ_{3} x_{4} (t - τ) \\ β_{7} x_{1} (t) + β_{8} x_{2} (t) + β_{9} x_{3} (t) + φ_{2} u (t) + λ_{4} x_{4} (t - τ) \end{matrix}] & (2) \end{matrix}$
For simplicity, the dependence on the parameters in the argument has been suppressed. The parameters α₁, α₂, β₁, . . . , β₉, φ₁, φ₂, λ₁, . . . , λ₄ represent the physical components in the cell model (i.e., resistors, the capacitor, the voltage and current sources, and the saturation limits). The parameter τ is a time delay, which can be estimated using the historical data. The control design generates an approximate solution of (1) by a piecewise linear stochastic differential equation over small time intervals.
Thus, let $t_0, t_1, \dots, t_i, t_{i+1}, \dots$ be a partition of the time line. On each interval $[t_i, t_{i+1})$, we seek solutions of the form
$\begin{matrix} x (t) = x (t_{i}) + δ \hat{x} (t) & (3) \\ \dot{x} (t) = δ \hat{\dot{x}} (t) & (4) \end{matrix}$
over $[t_i, t_{i+1})$, where $δ\hat{x}(t)$ is the conditional mean of $δx(t)$, and $δ\hat{\dot{x}}(t)$ is the conditional rate, obtained from a Kalman filter based on the following piecewise linear stochastic model. The stochastic increment $δx(t)$ satisfies the following stochastic differential equation:
$\begin{matrix} d δ x (t) = \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} δ x (t) dt + \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} δ u (t) dt + G (x (t_{i}), u (t_{i})) dt + d ω (t) & (5) \end{matrix}$
where the noise has zero mean and the covariance is proportional to the second order term that is taken out in the approximation, e.g.,
$ω (t) \sim N (0, α \frac{\partial^{2} \sum_{j = 1}^{3} G_{j} (x (t_{i}), u (t_{i}))}{\partial x^{2}} + ɛ I),$
where ε>0 and I is the identity matrix.
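As an illustration of the linearization used in (5), the following minimal sketch evaluates the right-hand side G of equation (2) and its partial derivatives with respect to the state and the control. It assumes that the delayed temperature x₄(t−τ) is supplied as a separate input and held fixed over the interval. The names `cell_dynamics`, `jacobians` and `PARAMS` are hypothetical, the single table value φ is assumed here to play the role of φ₁, and the values given for β₇ to β₉, φ₂ and λ₁ to λ₄ are placeholders that do not come from the description.

```python
from typing import Dict, List, Tuple

# a1..b6 and f1 follow the parameter table in the description (f1 is assumed
# to be the table's single phi value). b7..b9, f2 and l1..l4 are
# illustrative placeholders only.
PARAMS: Dict[str, float] = {
    "a1": -0.001, "a2": -0.002,
    "b1": -0.01, "b2": -0.01, "b3": 0.001,
    "b4": -0.01, "b5": 0.01, "b6": 0.001,
    "f1": 0.01,
    "b7": 0.001, "b8": 0.001, "b9": 0.001, "f2": 0.01,
    "l1": 0.0, "l2": 0.0, "l3": 0.0, "l4": -0.01,
}


def cell_dynamics(x: List[float], u: float, temp_delayed: float,
                  p: Dict[str, float] = PARAMS) -> List[float]:
    """Right-hand side G(x, u) of equation (2).

    x = [power, voltage, current, temperature]; temp_delayed is x4(t - tau).
    """
    x1, x2, x3, _x4 = x
    return [
        p["a1"] * x1 * x2 + p["a2"] * x1 * u + p["b1"] * x1 + p["l1"] * temp_delayed,
        p["b2"] * x2 - p["b3"] * x1 + p["f1"] * u + p["l2"] * temp_delayed,
        p["b4"] * x3 + p["b5"] * x2 + p["b6"] * x1 + p["l3"] * temp_delayed,
        p["b7"] * x1 + p["b8"] * x2 + p["b9"] * x3 + p["f2"] * u + p["l4"] * temp_delayed,
    ]


def jacobians(x: List[float], u: float,
              p: Dict[str, float] = PARAMS) -> Tuple[List[List[float]], List[float]]:
    """Partial derivatives dG/dx (4x4) and dG/du (4x1) at (x, u).

    The delayed temperature is treated as an exogenous input, so the
    temperature column of dG/dx is zero.
    """
    x1, x2, _x3, _x4 = x
    dG_dx = [
        [p["a1"] * x2 + p["a2"] * u + p["b1"], p["a1"] * x1, 0.0, 0.0],
        [-p["b3"], p["b2"], 0.0, 0.0],
        [p["b6"], p["b5"], p["b4"], 0.0],
        [p["b7"], p["b8"], p["b9"], 0.0],
    ]
    dG_du = [p["a2"] * x1, p["f1"], 0.0, p["f2"]]
    return dG_dx, dG_du
```

The matrices returned by `jacobians` at (x(t_i), u(t_i)) are the coefficients of δx(t) and δu(t) in (5), and `cell_dynamics` at the same point supplies the constant drift term.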
Sensors provide power, voltage, and current measurements. The observations from the battery sensors are modeled by
y(t)=x(t)+θ(t)
where the observation noise θ(t) is characteristic of the sensors, with zero mean and covariance matrix determined from the signal to noise ratio specifications. From (3) and (5) the incremental observation is given by
δy(t)=δx(t)+θ(t)
with δy(t)=y(t)−x(t_i).
An effective formulation of the tracking problem has a criterion of the form
$\begin{matrix} \min_{δ \dot{u} (t)} E \int_{t_{i}}^{t_{i + 1}} \frac{1}{2} [{(δ x (t) - {\tilde{x}}_{δ} (t))}^{T} Q_{0} (δ x (t) - {\tilde{x}}_{δ} (t)) + δ {\dot{x} (t)}^{T} Q_{1} δ \dot{x} (t) + δ {u (t)}^{T} R_{0} δ u (t) + δ {\dot{u} (t)}^{T} R_{1} δ \dot{u} (t)] dt + \frac{1}{2} δ {x (t_{i + 1})}^{T} F_{0} δ x (t_{i + 1}) . & (6) \end{matrix}$
The tracking value $\tilde{x}_{δ}(t)$ is generated dynamically from rules defining the desired power behavior of the battery. Note that rules from a CDI agent may not be given in terms of the original state vector (power, current, voltage, temperature), and if not, they are translated to state the desired behavior in terms of the desired incremental state $\tilde{x}_{δ}(t)$. A first example of such rules is given below.
Rule 1: over a week, at least 78% of the power demand should be satisfied.
Rule 2: battery longevity ≥ five years.
Rule 3: for the battery, satisfy thresholds on amount of charge and discharge.
A second example of such rules is given below.

- 1. Maximum charge limit: Do not charge the battery if the current charge has exceeded a first defined threshold level.
- 2. Minimum charge limit: Do not discharge the battery if the current charge is below a second defined threshold level.
- 3. Rate limitation: Do not change the desired power in/out to battery faster than a third defined threshold limit for rate of change.
- 4. Switching between charge and discharge: Reduce rate of charge when power is near zero to prevent switching between charging and discharging (or vice versa) too fast or frequently, such as based on one or more fourth defined threshold levels.
- 5. Maintain battery temperature: If the temperature begins to rise, adjust desired power output based on current battery state (e.g., if battery charge is low and the temperature starts increasing, then charge it).
  As part of these rules of the second example, fulfill power requests as much as possible (referred to at times herein as maximizing the “Q factor”), such as, for example, at a rate of 70%. If power requests are being satisfied at more and/or at less than a desired level (e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, etc.), weighting with one or more of the rules above may be adjusted up or down in some embodiments to lower or raise the level of satisfied power requests, respectively, and to correspondingly increase or reduce battery life, respectively, from such changes.
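The first three rules of the second example can be sketched as a simple filter applied to a requested power value. This is only an illustrative reading of the rules: the sign convention (positive power discharges the battery, negative power charges it), the ordering of the checks, and the names `RuleLimits` and `apply_rules` are assumptions, not part of the description.

```python
from dataclasses import dataclass


@dataclass
class RuleLimits:
    max_charge: float  # first threshold: no charging at or above this charge level
    min_charge: float  # second threshold: no discharging at or below this charge level
    max_rate: float    # third threshold: largest allowed change in desired power per step


def apply_rules(desired_power: float, previous_power: float,
                charge: float, limits: RuleLimits) -> float:
    """Return the desired power after applying rules 1-3.

    Assumed convention: power > 0 discharges the battery, power < 0 charges it.
    """
    # Rule 3 (rate limitation): clamp the change relative to the previous step.
    delta = desired_power - previous_power
    delta = max(-limits.max_rate, min(limits.max_rate, delta))
    power = previous_power + delta

    # Rule 1 (maximum charge limit): do not charge a battery that is already full.
    if power < 0 and charge >= limits.max_charge:
        power = 0.0

    # Rule 2 (minimum charge limit): do not discharge a battery that is nearly empty.
    if power > 0 and charge <= limits.min_charge:
        power = 0.0

    return power
```

In this sketch the charge limits take precedence over the rate limit, so the output may drop to zero in a single step when a charge threshold is reached; a different priority ordering would be equally consistent with the rules as stated.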
  The rules here pertain to the increments, and convert the average over a long time frame (weeks, years) to a running average over [t_i, t_i+1]. For the battery model, the temperature typically changes much faster than power, voltage and current. Motivated by Aoki's partitioning method, the problem may be decoupled through the state variables so that the full state space problem can be transformed into two sub-problems which are solved with different time intervals. The temperature controller is referred to as “a high speed controller” and the power/voltage/current controller “a low speed controller”, with the following details directed to the low speed controller. In particular, an optimal control tracking problem to be solved for the low speed controller may be summarized as:

$\begin{matrix} \min_{δ \dot{u} (t)} E \int_{t_{i}}^{t_{i + 1}} \frac{1}{2} [{(δ x (t) - {\tilde{x}}_{δ} (t))}^{T} Q_{0} (δ x (t) - {\tilde{x}}_{δ} (t)) + δ {\dot{x} (t)}^{T} Q_{1} δ \dot{x} (t) + δ {u (t)}^{T} R_{0} δ u (t) + δ {\dot{u} (t)}^{T} R_{1} δ \dot{u} (t)] dt + \frac{1}{2} δ {x (t_{i + 1})}^{T} F_{0} δ x (t_{i + 1}) subject to & (6) \\ d δ x (t) = \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} δ x (t) dt + \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} δ u (t) dt + G (x (t_{i}), u (t_{i})) dt + λ T (t - τ) dt + d ω_{0} (t) & (7) \\ d^{2} δ x (t) = \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} d δ x (t) dt + \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} δ \dot{u} (t) dt + d ω_{1} (t) & (8) \\ δ \dot{u} (t) = v (t) . & (9) \end{matrix}$
In (7),
$λ = [\begin{matrix} λ_{1} \\ λ_{2} \\ λ_{3} \end{matrix}],$
and $T(t−τ)$ is the average temperature for the previous time interval $t_{i-1} \le t \le t_i$. So $λT(t−τ)$ can be considered a constant for $t_i \le t \le t_{i+1}$. The optimization problem formulated in (6)-(9) satisfies the assumptions of the separation principle. This leads to the following approach.
Step 1: Determine the conditional mean $δ\hat{x}(t)$ and the conditional rate $δ\hat{\dot{x}}(t)$ for $t_i \le t \le t_{i+1}$, generated by a Kalman filter, which is described in a separate note.
Step 2: Solve (6)-(9), obtaining a feedback solution of the form
$\begin{matrix} δ \dot{u} (t) = K_{0} (t) δ x (t) + K_{1} (t) δ \dot{x} (t) + K_{2} (t) δ u (t) + ψ (t) & (10) \end{matrix}$
where $K_j(t)$, $j = 0, 1, 2$, and $ψ(t)$ are the gains and the affine terms resulting from the optimization.
Step 3: Replace $δx(t)$ and $δ\dot{x}(t)$ in (10) with $δ\hat{x}(t)$ and $δ\hat{\dot{x}}(t)$ from the outputs of the Kalman filter.
Step 4: Integrate the following equation:
$\begin{matrix} δ \dot{u} (t) = K_{0} (t) δ \hat{x} (t) + K_{1} (t) δ \hat{\dot{x}} (t) + K_{2} (t) δ u (t) + ψ (t) . & (11) \end{matrix}$
The intervals $[t_i, t_{i+1}]$ for all i are chosen small enough so that the gains $K_j(t)$, $j = 0, 1, 2$, and $ψ(t)$ can be considered constant over each interval, and are evaluated at $t_i$.
Using the variation of constants formula, the integral of (11) at $t_{i+1}^{-}$ is given by
$\begin{matrix} δ u (t_{i + 1}^{-}) = \int_{t_{i}}^{t_{i + 1}} e^{K_{2} (t_{i}) (t_{i + 1} - τ)} (K_{0} (τ) δ \hat{x} (τ) + K_{1} (τ) δ \hat{\dot{x}} (τ) + ψ (τ)) dτ & (12) \end{matrix}$
Integrate (12) using an impulsive approximation, assuming that the integrands are impulses at $t_i$. Thus, the incremental control for the low speed controller is
$\begin{matrix} δ u_{L} (t_{i + 1}^{-}) ≈ e^{K_{2} (t_{i}) (t_{i + 1} - t_{i})} (K_{0} (t_{i}) δ \hat{x} (t_{i}) + K_{1} (t_{i}) δ \hat{\dot{x}} (t_{i}) + ψ (t_{i})) . & (13) \end{matrix}$
Solving for the high speed controller, we get the incremental control for the high speed controller, $δu_H(t_{i+1}^{-})$. Taking the linear combination of the two incremental controls, we get
$\begin{matrix} δ u (t_{i + 1}^{-}) = ε_{1} δ u_{L} (t_{i + 1}^{-}) + ε_{2} δ u_{H} (t_{i + 1}^{-}) . & (14) \end{matrix}$
The control to the battery at $t_{i+1}$ is given by
$\begin{matrix} u (t_{i + 1}) = u (t_{i}) + δ u (t_{i + 1}^{-}) . & (15) \end{matrix}$
The state at $t_{i+1}$ is
$\begin{matrix} x (t_{i + 1}) = x (t_{i}) + δ \hat{x} (t_{i + 1}) . & (16) \end{matrix}$
The rate of the state at $t_{i+1}$ is
$\begin{matrix} \dot{x} (t_{i + 1}) = \dot{x} (t_{i}) + δ \hat{\dot{x}} (t_{i + 1}) . & (17) \end{matrix}$
Both (16) and (17) are sent to the rules module for future requirements to update {tilde over (x)}_δ(t). This approach is repeated on the next interval. This is illustrated in FIG. 47, showing the architecture 4700 of the battery controller.
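The per-interval update in (13) through (17) can be sketched as follows for a scalar control and a three-element low speed state. The function names are hypothetical, the gains and the high speed increment are assumed to be supplied by other components, and the sketch is not a complete controller.

```python
import math
from typing import List, Tuple


def low_speed_increment(k0: List[float], k1: List[float], k2: float, psi: float,
                        dx_hat: List[float], dxdot_hat: List[float],
                        dt: float) -> float:
    """Impulsive approximation of equation (13) for a scalar control."""
    inner = (sum(k * x for k, x in zip(k0, dx_hat))
             + sum(k * x for k, x in zip(k1, dxdot_hat))
             + psi)
    return math.exp(k2 * dt) * inner


def controller_step(u_prev: float, x_prev: List[float], xdot_prev: List[float],
                    du_low: float, du_high: float,
                    dx_hat: List[float], dxdot_hat: List[float],
                    eps1: float, eps2: float) -> Tuple[float, List[float], List[float]]:
    """Apply equations (14) to (17) once at the end of an interval."""
    du = eps1 * du_low + eps2 * du_high                       # (14)
    u_next = u_prev + du                                      # (15)
    x_next = [a + b for a, b in zip(x_prev, dx_hat)]          # (16)
    xdot_next = [a + b for a, b in zip(xdot_prev, dxdot_hat)]  # (17)
    return u_next, x_next, xdot_next
```

The returned state and rate are the quantities that the description sends to the rules module to update the tracking value for the next interval.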
The model of the cell may be trained, for example, with the following parameters, which are typical of a 1 kW rated Lithium ion battery.
Parameter Table

Parameter    Value
α₁           −0.001
α₂           −0.002
β₁           −0.01
β₂           −0.01
β₃           0.001
β₄           −0.01
β₅           0.01
β₆           0.001
φ            0.01

With respect to the battery operation, the controller for this example compensates very well for the uncertainty in the parameters and for approximation errors, and exhibits good robustness and time response.
Additional details regarding an example embodiment of continuously estimating state for a battery in use, such as for a generic lithium ion high power battery cell and for a low speed controller, are as follows:
The stochastic differential equation for the state update is:
$\begin{matrix} d δ x (t) = A (t) δ x (t) dt + B (t) δ u (t) dt + f (t) dt + d ω (t) & (1) \end{matrix}$
where the state $δx(t) \in R^{3}$, $u(t) \in R^{1}$, A(t) is a 3×3 matrix, B(t) is a 3×1 vector, f(t) is a 3×1 vector and ω(t) is a 3×1 vector. The noise has zero mean and covariance matrix W, e.g., ω(t) ~ N(0, W).
The observation equation is given by
$\begin{matrix} δ y (t) = δ x (t) + θ (t) & (2) \end{matrix}$
where the measurement $δy(t) \in R^{3}$, and θ(t) is a 3×1 vector. The noise has zero mean and covariance matrix V, e.g., θ(t) ~ N(0, V).
The state update equation is:
$\begin{matrix} δ \hat{\dot{x}} (t) = A (t) δ \hat{x} (t) + B (t) δ u (t) + K (t) (δ y (t) - δ \hat{x} (t)) & (3) \end{matrix}$
where the Kalman gain K is a 3×3 matrix, given by
$\begin{matrix} K (t) = P (t) V^{- 1} & (4) \end{matrix}$
and the covariance update equation is:
$\begin{matrix} \dot{P} (t) = A (t) P (t) + P (t) {A (t)}^{T} + W - P (t) V^{- 1} P (t) & (5) \end{matrix}$
In order to determine the conditional mean $δ\hat{x}(t)$ and the conditional rate $δ\hat{\dot{x}}(t)$ for $t_i \le t < t_{i+1}$, the first step is to initialize $δ\hat{x}(t_i) = 0$ and $P(t_i) = 0$.
The second step is to solve (5) numerically (e.g., with a Runge-Kutta method) for P(t), for $t_i \le t < t_{i+1}$. The third step is to get the Kalman gain, using (4).
The fourth step is to solve (3) numerically, where the parameters A(t), B(t) and the observations δy(t) are known. This provides the conditional mean $δ\hat{x}(t)$ and the conditional rate $δ\hat{\dot{x}}(t)$ for $t_i \le t < t_{i+1}$.
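The four steps above can be sketched with NumPy as follows. The sketch assumes that A, B, W, V, the applied control increment and the observation are all constant over the interval, uses a classical fourth-order Runge-Kutta step for (5) and a forward Euler step for (3), and follows (3) as written (so the drift f(t) does not appear in the estimator). The function name `kalman_interval` and its signature are hypothetical.

```python
import numpy as np


def kalman_interval(A, B, W, V, du, dy, dt, steps=100):
    """Estimate the conditional mean and rate over one interval [t_i, t_i+1).

    A: (n, n), B: (n,), W and V: (n, n) covariances, du: scalar control
    increment, dy: (n,) incremental observation, dt: interval length.
    Returns (x_hat, x_rate, P) at the end of the interval.
    """
    n = A.shape[0]
    V_inv = np.linalg.inv(V)
    h = dt / steps

    # Step 1: initialize the estimate and the covariance at t_i.
    x_hat = np.zeros(n)
    x_rate = np.zeros(n)
    P = np.zeros((n, n))

    def p_dot(P_now):
        # Covariance update equation (5).
        return A @ P_now + P_now @ A.T + W - P_now @ V_inv @ P_now

    for _ in range(steps):
        # Step 2: one Runge-Kutta (RK4) step of the Riccati equation.
        k1 = p_dot(P)
        k2 = p_dot(P + 0.5 * h * k1)
        k3 = p_dot(P + 0.5 * h * k2)
        k4 = p_dot(P + h * k3)
        P = P + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

        # Step 3: Kalman gain from equation (4).
        K = P @ V_inv

        # Step 4: one Euler step of the state update equation (3).
        x_rate = A @ x_hat + B * du + K @ (dy - x_hat)
        x_hat = x_hat + h * x_rate

    return x_hat, x_rate, P
```

A higher-order integrator could be used for step 4 as well; forward Euler is chosen here only to keep the sketch short.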
Additional details regarding an example embodiment of computing feedback gain for a battery in use, such as for a generic lithium ion high power battery cell and for a low speed controller as described above, are as follows:
An optimal control tracking problem to be solved may be summarized as:
$\begin{matrix} \min_{δ \dot{u} (t)} E \int_{t_{i}}^{t_{i + 1}} \frac{1}{2} [{(δ x (t) - {\tilde{x}}_{δ} (t_{i}))}^{T} Q_{0} (δ x (t) - {\tilde{x}}_{δ} (t_{i})) + δ {\dot{x} (t)}^{T} Q_{1} δ \dot{x} (t) + δ {u (t)}^{T} R_{0} δ u (t) + δ {\dot{u} (t)}^{T} R_{1} δ \dot{u} (t)] dt + \frac{1}{2} δ {x (t_{i + 1})}^{T} F_{0} δ x (t_{i + 1}) subject to & (1) \\ d δ x (t) = \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} δ x (t) dt + \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} δ u (t) dt + G (x (t_{i}), u (t_{i})) dt + λ T (t - τ) dt + d ω_{0} (t) & (2) \\ d^{2} δ x (t) = \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} d δ x (t) dt + \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} δ \dot{u} (t) dt + d ω_{1} (t) & (3) \\ δ \dot{u} (t) = v (t) . & (4) \end{matrix}$
Define the new state variable
$z (t) = [\begin{matrix} δ x (t) \\ δ \dot{x} (t) \\ δ u (t) \end{matrix}]$
and the new control variable $v(t) = δ\dot{u}(t)$. The state equations (2)-(4) can then be rewritten as
$\dot{z} (t) = \tilde{A} (t_{i}) z (t) + \tilde{B} (t_{i}) v (t) + \tilde{f} (t_{i})$
with the initial conditions $z(t_i) = 0$, and
$\tilde{A} (t_{i}) = [\begin{matrix} \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} & 0 & \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} \\ 0 & \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial x} & 0 \\ 0 & 0 & ε I \end{matrix}], \tilde{B} (t_{i}) = [\begin{matrix} 0 \\ \frac{\partial G (x (t_{i}), u (t_{i}))}{\partial u} \\ 1 \end{matrix}], \tilde{f} (t_{i}) = [\begin{matrix} G (x (t_{i}), u (t_{i})) + λ T (t - τ) \\ 0 \\ 0 \end{matrix}] .$
The optimal control tracking problem to be solved is summarized as:
$\min_{v (t)} E \int_{t_{i}}^{t_{i + 1}} \frac{1}{2} [{(z (t) - \tilde{z} (t_{i}))}^{T} Q (z (t) - \tilde{z} (t_{i})) + {v (t)}^{T} R v (t)] dt + \frac{1}{2} {z (t_{i + 1})}^{T} F z (t_{i + 1})$
subject to
$d z (t) = \tilde{A} (t_{i}) z (t) dt + \tilde{B} (t_{i}) v (t) dt + \tilde{f} (t_{i}) dt + d ω (t)$
with
$Q = [\begin{matrix} Q_{0} & 0 & 0 \\ 0 & Q_{1} & 0 \\ 0 & 0 & R_{0} \end{matrix}], R = R_{1}, F = [\begin{matrix} F_{0} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}], \tilde{z} (t_{i}) = [\begin{matrix} {\tilde{x}}_{δ} (t_{i}) \\ 0 \\ 0 \end{matrix}] .$
The feedback law is
$\begin{matrix} v (t) = - R^{- 1} {\tilde{B}}^{T} p (t) \text{ with } p (t) = Σ (t) z (t) + φ (t) . & (5) \end{matrix}$
Σ(t) is a matrix, and φ(t) is a vector. They are provided by solving the following ordinary differential equations.
$\begin{matrix} \dot{Σ} (t) = - Σ (t) \tilde{A} (t_{i}) - {\tilde{A} (t_{i})}^{T} Σ (t) + Σ (t) \tilde{B} (t_{i}) R^{- 1} {\tilde{B} (t_{i})}^{T} Σ (t) - Q \text{ with } Σ (t_{i + 1}) = F & (6) \\ \dot{φ} (t) = (- {\tilde{A} (t_{i})}^{T} + Σ (t) \tilde{B} (t_{i}) R^{- 1} {\tilde{B} (t_{i})}^{T}) φ (t) - Σ (t) \tilde{f} (t_{i}) + Q \tilde{z} (t_{i}) \text{ with } φ (t_{i + 1}) = 0 . & (7) \end{matrix}$

From (5),

$\begin{matrix} v (t) = - R^{- 1} {\tilde{B}}^{T} (Σ (t) z (t) + φ (t)) = - R^{- 1} {\tilde{B}}^{T} Σ (t) z (t) - R^{- 1} {\tilde{B}}^{T} φ (t) = K_{LQ} (t) z (t) + ψ (t) & (8) \end{matrix}$
where $K_{LQ}(t) = - R^{-1} \tilde{B}^{T} Σ(t)$ and $ψ(t) = - R^{-1} \tilde{B}^{T} φ(t)$.
By the separation principle, we can calculate the K_LQ(t) and ψ(t) by solving the deterministic problem first, then replace the state z(t) in (8) with
$\hat{z} (t) = [\begin{matrix} δ \hat{x} (t) \\ δ \hat{\dot{x}} (t) \\ δ u (t) \end{matrix}]$
to get v(t), as shown in (9).
$\begin{matrix} v (t) = δ \dot{u} (t) = K_{0} (t) δ \hat{x} (t) + K_{1} (t) δ \hat{\dot{x}} (t) + K_{2} (t) δ u (t) + ψ (t) & (9) \end{matrix}$
Using an impulsive approximation, assuming that the integrands are impulses at $t_i$, yields
$\begin{matrix} δ u (t_{i + 1}^{-}) ≈ e^{K_{2} (t_{i}) (t_{i + 1} - t_{i})} (K_{0} (t_{i}) δ \hat{x} (t_{i}) + K_{1} (t_{i}) δ \hat{\dot{x}} (t_{i}) + ψ (t_{i})) & (10) \end{matrix}$
To get the state value for the deterministic problem, solve the following differential equation.
$\dot{z} (t) = \tilde{A} (t_{i}) z (t) + \tilde{B} (t_{i}) (K_{LQ} (t_{i}) z (t) + ψ (t_{i})) + \tilde{f} (t_{i}) = (\tilde{A} (t_{i}) + \tilde{B} (t_{i}) K_{LQ} (t_{i})) z (t) + \tilde{B} (t_{i}) ψ (t_{i}) + \tilde{f} (t_{i})$
An architecture for such operations is illustrated in system 5000 of FIG. 50.
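As a rough numerical illustration of (6) through (8), the sketch below integrates the two ordinary differential equations backward from their terminal conditions at t_i+1 and returns the feedback gain and affine term at t_i. It assumes constant matrices over the interval and uses a plain Euler step in reverse time; the function name `lq_tracking_gains` and its signature are hypothetical.

```python
import numpy as np


def lq_tracking_gains(A, B, f, Q, R, F, z_ref, dt, steps=200):
    """Return (K_LQ, psi) at t_i by integrating (6) and (7) backward in time.

    A: (n, n), B: (n, m), f: (n,), Q: (n, n), R: (m, m), F: (n, n),
    z_ref: (n,) tracking target, dt: interval length t_i+1 - t_i.
    """
    h = dt / steps
    R_inv = np.linalg.inv(R)
    S = B @ R_inv @ B.T

    # Terminal conditions: Sigma(t_i+1) = F and phi(t_i+1) = 0.
    Sigma = np.array(F, dtype=float)
    phi = np.zeros(A.shape[0])

    for _ in range(steps):
        Sigma_dot = -Sigma @ A - A.T @ Sigma + Sigma @ S @ Sigma - Q   # (6)
        phi_dot = (-A.T + Sigma @ S) @ phi - Sigma @ f + Q @ z_ref      # (7)
        # Stepping from t to t - h subtracts h times the time derivative.
        Sigma = Sigma - h * Sigma_dot
        phi = phi - h * phi_dot

    K_lq = -R_inv @ B.T @ Sigma   # gain in (8)
    psi = -R_inv @ B.T @ phi      # affine term in (8)
    return K_lq, psi
```

For a scalar integrator with unit weights and a long horizon, the result approaches K_LQ = −1 and ψ equal to the target value, so the closed loop settles at the target; this is a convenient sanity check for the signs in (6) and (7).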
Additional details regarding an example embodiment of an emergency battery tracking controller are as follows:
The state space model of the battery system:
$\begin{matrix} δ \dot{x} (t) = A δ x (t) + B δ u (t) + f & (1) \\ y (t) = C δ x (t) & (2) \end{matrix}$
The state observer is of the form:
$\begin{matrix} δ \hat{\dot{x}} (t) = A δ \hat{x} (t) + B δ u (t) + f + K (y (t) - C δ \hat{x} (t)) & (3) \end{matrix}$
where $\hat{x}(t)$ is the estimate of $x(t)$ and K is the latest Kalman filter gain computed by the hybrid tracker before we switch to the closed-loop tracker.
Equation (3) can be rewritten as follows:
$\begin{matrix} δ \hat{\dot{x}} (t) = (A - KC) δ \hat{x} (t) + B δ u (t) + f + K y (t) & (4) \end{matrix}$
We take the Laplace transform of both sides of the equation:
$\begin{matrix} s δ \hat{x} (s) - δ \hat{x} (0) = (A - KC) δ \hat{x} (s) + B δ u (s) + f (s) + K y (s) & (5) \end{matrix}$
where $δ\hat{x}(0) = 0$ and
$f (s) = \frac{f}{s} .$
Solving for $δ\hat{x}(s)$, we find:
$\begin{matrix} δ \hat{x} (s) = {(sI - A + KC)}^{- 1} [B δ u (s) + f (s) + K y (s)] & (6) \end{matrix}$
Multiplying by C on both sides of (6) and replacing $C δ\hat{x}(s)$ with y(s), we get:
y(s)=C(sI−A+KC)⁻¹(Bδu(s)+f(s)+Ky(s)) (7)
Solving for y(s), we find:
(I−C(sI−A+KC)⁻¹ K)y(s)=C(sI−A+KC)⁻¹(Bδu(s)+f(s)) (8)
y(s)=[I−C(sI−A+KC)⁻¹ K] ⁻¹ C(sI−A+KC)⁻¹(Bδu(s)+f(s)) (9)
Let
G _u(s)=[I−C(sI−A+KC)⁻¹ K] ⁻¹ C(sI−A+KC)⁻¹ B (10)
G _f(s)=[I−C(sI−A+KC)⁻¹ K] ⁻¹ C(sI−A+KC)⁻¹ (11)
then,
y(s)=G _u(s)δu(s)+G _f(s)f(s) (12)
Let $y_{ref}(t)$ be a reference signal, which can be obtained by setting $δ\dot{x}(t) = 0$ and $δu(t) = 0$ in (1), and then solving for $δx(t)$, yielding
$\begin{matrix} y_{ref} (t) = - {(A^{T} A)}^{- 1} A^{T} f & (13) \end{matrix}$
Design a PI controller, such that
$\begin{matrix} δ u (s) = (K_{p} + \frac{K_{i}}{s}) (y_{ref} (s) - y (s)) & (14) \end{matrix}$
where $K_p$ is the proportional gain and $K_i$ is the integral gain to be determined, and
$y_{ref} (s) = - {(A^{T} A)}^{- 1} A^{T} f / s .$
The feedback controller 4800 is shown in FIG. 48. Combine (12) and (14), yielding
$\begin{matrix} y (s) = G_{u} (s) (K_{p} + \frac{K_{i}}{s}) (y_{ref} (s) - y (s)) + G_{f} (s) f (s) \\ [I + G_{u} (s) (K_{p} + \frac{K_{i}}{s})] y (s) = G_{u} (s) (K_{p} + \frac{K_{i}}{s}) y_{ref} (s) + G_{f} (s) f (s) \end{matrix}$
Solving for y(s), we get:
$\begin{matrix} y (s) = {[I + G_{u} (s) (K_{p} + \frac{K_{i}}{s})]}^{- 1} G_{u} (s) (K_{p} + \frac{K_{i}}{s}) y_{ref} (s) + {[I + G_{u} (s) (K_{p} + \frac{K_{i}}{s})]}^{- 1} G_{f} (s) f (s) & (15) \end{matrix}$
Now stabilize the feedback control system with $K_p$ and $K_i$ to be determined. The closed-loop transfer function is:
$\begin{matrix} T (s) = {[I + G_{u} (s) (K_{p} + \frac{K_{i}}{s})]}^{- 1} G_{u} (s) (K_{p} + \frac{K_{i}}{s}) & (16) \\ = {[I + {[I - C {(sI - A + KC)}^{- 1} K]}^{- 1} C {(sI - A + KC)}^{- 1} B (K_{p} + \frac{K_{i}}{s})]}^{- 1} {[I - C {(sI - A + KC)}^{- 1} K]}^{- 1} C {(sI - A + KC)}^{- 1} B (K_{p} + \frac{K_{i}}{s}) & (17) \end{matrix}$
Choose $K_p$ and $K_i$ such that all the poles of T(s) are located in the open left half of the s-plane.
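One way to check a candidate pair of gains numerically is to form a state space realization of the loop and inspect its eigenvalues. By the push-through identity, the expression for G_u(s) in (10) reduces to C(sI − A)⁻¹B, so the loop can be realized as the plant (A, B, C) in feedback with a PI controller whose integrator state is q(t), the running integral of the tracking error. The sketch below builds that augmented matrix; it assumes a single-input, single-output configuration for the example values, and the function names are hypothetical. If every eigenvalue has a negative real part, all poles of T(s) lie in the open left half plane.

```python
import numpy as np


def closed_loop_poles(A, B, C, Kp, Ki):
    """Eigenvalues of the plant-plus-PI loop.

    State is [dx; q] with q' = y_ref - C dx and du = Kp (y_ref - C dx) + Ki q.
    A: (n, n), B: (n, m), C: (p, n), Kp and Ki: (m, p).
    """
    p = C.shape[0]
    top = np.hstack([A - B @ Kp @ C, B @ Ki])
    bottom = np.hstack([-C, np.zeros((p, p))])
    return np.linalg.eigvals(np.vstack([top, bottom]))


def is_stabilizing(A, B, C, Kp, Ki):
    """True if every closed-loop eigenvalue has a negative real part."""
    return bool(np.all(closed_loop_poles(A, B, C, Kp, Ki).real < 0.0))


# Illustrative values only (not taken from the description).
A_example = np.diag([-1.0, -2.0, -3.0])
B_example = np.array([[1.0], [1.0], [1.0]])
C_example = np.array([[1.0, 0.0, 0.0]])
stable = is_stabilizing(A_example, B_example, C_example,
                        np.array([[1.0]]), np.array([[1.0]]))
```

With more outputs than inputs, some integrator modes cannot be moved by the control, which is one reason the example uses a single output.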

Numerical Implementation

The numerical algorithm to compute $K_p$ and $K_i$ is shown below. As previously mentioned, the state space model of the battery system is as follows:
$\begin{matrix} δ \dot{x} (t) = A δ x (t) + B δ u (t) + f & (18) \\ δ y (t) = C δ x (t) & (19) \end{matrix}$
The state observer is of the form:
$\begin{matrix} δ \hat{\dot{x}} (t) = A δ \hat{x} (t) + B δ u (t) + f + K (δ y (t) - C δ \hat{x} (t)) & (20) \end{matrix}$
where $\hat{x}(t)$ is the estimate of $x(t)$ and K is the latest Kalman filter gain computed by the hybrid tracker before we switch to the closed-loop tracker. δy(t) is obtained from the historical measurement data. Equation (20) can be rewritten as follows:
$\begin{matrix} δ \hat{\dot{x}} (t) = (A - KC) δ \hat{x} (t) + B δ u (t) + f + K δ y (t) & (21) \end{matrix}$
Equation (21) can be approximated by
$\begin{matrix} δ \hat{x} (t + Δ) = e^{(A - KC) Δ} δ \hat{x} (t) + \int_{t}^{t + Δ} e^{(A - KC) (t + Δ - τ)} (B δ u (τ) + f + K δ y (τ)) dτ & (22) \end{matrix}$
Assuming the integrands are impulses at t, we have
$\begin{matrix} \begin{matrix} δ \hat{x} (t + Δ) = e^{(A - KC) Δ} δ \hat{x} (t) + e^{(A - KC) Δ} (B δ u (t) + f + K δ y (t)) \\ = e^{(A - KC) Δ} (δ \hat{x} (t) + B δ u (t) + f + K δ y (t)) \end{matrix} & (23) \end{matrix}$
Design a PI controller, such that
$\begin{matrix} δ u (t) = K_{p} (δ y_{ref} - δ y (t)) + K_{i} \int_{0}^{t} (δ y_{ref} - δ y (τ)) dτ & (24) \end{matrix}$
where $K_p$ is the proportional gain and $K_i$ is the integral gain to be determined, and $δy_{ref}$ is a reference signal, which can be obtained by setting $δ\dot{x}(t) = 0$ and $δu(t) = 0$ in (18), and then solving for $δx(t)$, yielding
$\begin{matrix} δ y_{ref} = - {(A^{T} A)}^{- 1} A^{T} f & (25) \end{matrix}$
Solve a least squares problem for $K_p$ and $K_i$ such that the integral of squared residuals $\int_{0}^{T} {(δ y_{ref} - C δ \hat{x} (t))}^{T} (δ y_{ref} - C δ \hat{x} (t)) dt$ is minimized, subject to equations (23), (24) and (25).
$\begin{matrix} \min_{K_{p}, K_{i}} \sum_{n = 0}^{N} {(δ y_{ref} - C δ \hat{x} (n Δ))}^{T} (δ y_{ref} - C δ \hat{x} (n Δ)) & (26) \\ s . t . δ u (n Δ) = K_{p} (δ y_{ref} - δ y (n Δ)) + K_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y (i Δ)) & (27) \\ δ \hat{x} ((n + 1) Δ) = e^{(A - KC) Δ} (δ \hat{x} (n Δ) + B δ u (n Δ) + f + K δ y (n Δ)) & (28) \\ δ \hat{x} (0) = 0, \forall n = 0 \dots N . & (29) \end{matrix}$
where, by Nyquist, N is determined by the minimum eigenvalue $λ_{\min}$ of the state transition matrix A, i.e.,
$\begin{matrix} N = (5 \sim 10) \frac{1}{λ_{\min}}, & (30) \end{matrix}$
Letting $\tilde{A} = e^{(A - KC) Δ}$, $δ\hat{x}_{n} = δ\hat{x}(nΔ)$ and $δy_{n} = δy(nΔ)$, and replacing $δu(nΔ)$ in (28) with (27), yields:
$\begin{matrix} \min_{K_{p}, K_{i}} \sum_{n = 0}^{N} {(δ y_{ref} - C δ {\hat{x}}_{n})}^{T} (δ y_{ref} - C δ {\hat{x}}_{n}) & (31) \\ s . t . δ {\hat{x}}_{n + 1} = \tilde{A} (δ {\hat{x}}_{n} + {BK}_{p} (δ y_{ref} - δ y_{n}) + {BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}) + f + K δ y_{n}) & (32) \\ δ {\hat{x}}_{0} = 0, \forall n = 0 \dots N . & (33) \end{matrix}$
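Because the measurements δy_n in (31)-(33) are recorded historical data, the recursion (32) is affine in the gains, so for a scalar control the problem can also be solved directly as an ordinary linear least squares problem. The sketch below takes that route with NumPy instead of the dynamic programming approach used in the description, so it is an alternative illustration rather than the described method. The function names, the scalar-control assumption and the array shapes are all assumptions.

```python
import numpy as np


def fit_pi_gains(A_tilde, B, C, K, f, y_ref, dy_hist):
    """Least squares fit of PI gains for (31)-(33) with a scalar control.

    A_tilde: (n, n), B: (n,), C: (p, n), K: (n, p), f: (n,), y_ref: (p,),
    dy_hist: (N + 1, p) historical incremental measurements.
    Returns (Kp, Ki), each a length-p row of gains.
    """
    n, p = A_tilde.shape[0], C.shape[0]
    a = np.zeros(n)            # part of the estimate that does not depend on the gains
    M = np.zeros((n, 2 * p))   # sensitivity of the estimate to theta = [Kp, Ki]
    err_sum = np.zeros(p)
    rows, targets = [], []
    for dy in dy_hist:
        # Residual at step n: (y_ref - C a_n) - (C M_n) theta.
        rows.append(C @ M)
        targets.append(y_ref - C @ a)
        err = y_ref - dy
        err_sum = err_sum + err
        # Recursion (32), split into gain-independent and gain-dependent parts.
        a = A_tilde @ (a + f + K @ dy)
        M = A_tilde @ (M + np.outer(B, np.concatenate([err, err_sum])))
    theta, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(targets), rcond=None)
    return theta[:p], theta[p:]


def tracking_cost(A_tilde, B, C, K, f, y_ref, dy_hist, Kp, Ki):
    """Evaluate the objective (31) by simulating (32) for given gains."""
    x_hat = np.zeros(A_tilde.shape[0])
    err_sum = np.zeros(C.shape[0])
    cost = 0.0
    for dy in dy_hist:
        r = y_ref - C @ x_hat
        cost += float(r @ r)
        err = y_ref - dy
        err_sum = err_sum + err
        du = float(Kp @ err + Ki @ err_sum)
        x_hat = A_tilde @ (x_hat + B * du + f + K @ dy)
    return cost
```

`tracking_cost` can be used to compare the fitted gains against any other candidate pair on the same historical data.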
Solve the problem using a dynamic programming approach. The cost-to-go function is written as:
$\begin{matrix} V (δ {\hat{x}}_{n}, n) = \min_{K_{p}, K_{i}} {{(δ y_{ref} - C δ {\hat{x}}_{n})}^{T} (δ y_{ref} - C δ {\hat{x}}_{n}) + V ({\hat{x}}_{n + 1}, n + 1)} & (34) \\ = \min_{K_{p}, K_{i}} {{(δ {\hat{x}}_{n})}^{T} C^{T} C δ {\hat{x}}_{n} - 2 {(δ {\hat{x}}_{n})}^{T} C^{T} δ y_{ref} + {(δ y_{ref})}^{T} δ y_{ref} + V (δ {\hat{x}}_{n + 1}, n + 1)} & (35) \end{matrix}$
with V(δx̂_N,N)=(δx̂_N)^T C^T C δx̂_N−2(δx̂_N)^T C^T δy_ref+(δy_ref)^T δy_ref, and equate it to the Riccati form
V(δx̂_n,n)=(δx̂_n)^T Φ_n δx̂_n+(δx̂_n)^T Ψ_n+Ω_n (36)
where Φ_n represents a symmetric positive definite matrix, Ψ_n is a positive vector, and Ω_n is a positive scalar. Combining the equations (35), (36) and the dynamics in (32), yields
$\begin{matrix} V (δ {\hat{x}}_{n}, n) = \min_{K_{p}, K_{i}} {{(δ {\hat{x}}_{n})}^{T} C^{T} C δ {\hat{x}}_{n} - 2 {(δ {\hat{x}}_{n})}^{T} C^{T} δ y_{ref} + {(δ y_{ref})}^{T} δ y_{ref} + {(δ {\hat{x}}_{n + 1})}^{T} Φ_{n} δ {\hat{x}}_{n + 1} + {(δ {\hat{x}}_{n + 1})}^{T} Ψ_{n + 1} + Ω_{n + 1}} = \min_{K_{p} K_{i}} {{(δ {\hat{x}}_{n})}^{T} C^{T} C δ {\hat{x}}_{n} - 2 {(δ {\hat{x}}_{n})}^{T} C^{T} δ y_{ref} + {(δ y_{ref})}^{T} δ y_{ref} & (37) \\ + {(δ {\hat{x}}_{n} + {BK}_{p} (δ y_{ref} - δ y_{n}) + {BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}) + f + K {δy}_{n})}^{T} {\tilde{A}}^{T} Φ_{n} \tilde{A} (δ {\hat{x}}_{n} + {BK}_{p} (δ y_{ref} - δ y_{n}) + {BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}) + f + K δ y_{n}) + {(δ {\hat{x}}_{n} + {BK}_{p} (δ y_{ref} - δ y_{n}) + {BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}) + f + K δ y_{n})}^{T} Ψ_{n + 1} + Ω_{n + 1}} & (38) \end{matrix}$
In order to minimize this expression, isolate the terms with K_p and K_i in them,
$\begin{matrix} {({BK}_{p} (δ y_{ref} - δ y_{n}))}^{T} {\tilde{A}}^{T} Φ_{n + 1} \tilde{A} ({BK}_{p} (δ y_{ref} - δ y_{n})) + 2 {({BK}_{p} (δ y_{ref} - δ y_{n}))}^{T} {\tilde{A}}^{T} Φ_{n + 1} \tilde{A} ({BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}) + δ {\hat{x}}_{n} + f + K {δy}_{n}) + {({BK}_{p} (δ y_{ref} - δ y_{n}))}^{T} Ψ_{n + 1} + {({BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}))}^{T} {\tilde{A}}^{T} Φ_{n + 1} \tilde{A} ({BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i})) + 2 {({BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}))}^{T} {\tilde{A}}^{T} Φ_{n + 1} \tilde{A} ({BK}_{p} (δ y_{ref} - δ y_{n}) + δ {\hat{x}}_{n} + f + K δ y_{n}) & (39) \\ + {({BK}_{i} \sum_{i = 0}^{n} (δ y_{ref} - δ y_{i}))}^{T} Ψ_{n + 1} & (40) \end{matrix}$
and take the derivative with respect to each element of K_p and K_i, and set the value to 0. This yields the solution for the optimal gains K_p* and K_i* with respect to Φ_n, Ψ_n and Ω_n. Equating the Riccati form (36) with the value function in (38) evaluated at K_p* and K_i*, solve for Φ_n, Ψ_n and Ω_n, and thereafter for the numerical values of K_p* and K_i*.
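The objective (31) and the dynamics (32) can be explored numerically with a short Python sketch. The sketch below is not the dynamic programming solution described above: it swaps in a brute-force grid search over the two gains for a scalar system, and it assumes that the measurement δy_n can be replaced by the estimator output Cδx̂_n. All numerical values and the names `tracking_cost` and `fit_pi_gains` are hypothetical.

```python
import numpy as np

def tracking_cost(kp, ki, a_tilde, b, c, k_obs, f, y_ref, n_steps):
    """Sum of squared residuals (31) for a scalar estimator driven by (32)."""
    x_hat = 0.0      # initial condition (33)
    err_sum = 0.0    # running sum of (y_ref - y_i), the integral term
    cost = 0.0
    for _ in range(n_steps + 1):
        y = c * x_hat                     # ASSUMPTION: measurement = estimator output
        cost += (y_ref - y) ** 2
        err_sum += y_ref - y
        x_hat = a_tilde * (x_hat + b * kp * (y_ref - y)
                           + b * ki * err_sum + f + k_obs * y)
    return cost

def fit_pi_gains(kp_grid, ki_grid, **model):
    """Grid search (not the Riccati recursion) for the lowest-cost gains."""
    return min((tracking_cost(kp, ki, **model), kp, ki)
               for kp in kp_grid for ki in ki_grid)

# Hypothetical scalar model: A = -0.5, K = 0.2, C = 1, step 0.1.
model = dict(a_tilde=float(np.exp((-0.5 - 0.2 * 1.0) * 0.1)),
             b=1.0, c=1.0, k_obs=0.2, f=0.1, y_ref=0.2, n_steps=50)
kp_grid = np.linspace(0.0, 2.0, 21)
ki_grid = np.linspace(0.0, 0.5, 11)

best_cost, best_kp, best_ki = fit_pi_gains(kp_grid, ki_grid, **model)
print(best_cost, best_kp, best_ki)
```

A grid search is only practical for a handful of scalar gains; it is used here because it makes the cost function easy to inspect.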
Additional details regarding an example embodiment of controlling a battery as part of a network of multiple home solar power systems are as follows, which in this example use functionality to support Internet Of Things (IoT) capabilities in an online network environment such as Microsoft's Azure:

Architecture

In this example, the architecture is designed to control millions of batteries at multiple sites. The architecture consists of the following components:

- Battery control units. These reside on-site and allow for control of batteries even in the case where the network connection is lost.
- Battery interface (Azure IoT hub). This allows sending and receiving of data to and from each battery control unit.
- Utility interface (Azure IoT hub). This allows sending and receiving of data to and from the utility company.
- CDI agents. There is one agent for each battery to control; these communicate with the batteries over the battery interface, and with each other to determine an optimal control.
- Site management. This component adapts the network of CDI agents to allow for batteries to be added or removed from control automatically.
- Monitoring. This component tracks vital statistics of the CDI agents to allow a user to verify that everything is working, and to provide diagnostics to take action if a problem occurs.

Battery Control Units

The battery control units are located in hardware, on site, and are directly connected to the battery. The battery control units connect to the cloud to receive their desired control via the Azure Internet of Things (IoT) Hub. When the connection is lost, the control units provide backup control to the batteries to ensure they stay in a stable state.

Battery Interface

As mentioned above, each site will connect to this example system via the Azure IoT (Internet of Things) Hub. This will be done using the HTTPS or AMQPS protocols. The IoT hub allows us to scale the number of batteries to the millions, and will handle authentication and message routing. Each battery control unit will have a unique topic that it sends sensor data to, and a unique topic to which other components can send control messages back to the battery.

Utility Interface

Another process is responsible for reading/writing utility requests to the utility. This interface is likely a variant of SCADA, but may be adapted to suit the utility company.

CDI Agents

Each battery will have a process, known as a “CDI agent” to compute the optimal control for the battery. The agents will be implemented as service bus listeners in the Azure cloud. The agents communicate with other agents via the Azure Service Bus using AMQP. Each agent subscribes to the IoT topic from the battery it controls, and so can receive the current state of its battery.
To compute an approximate globally optimal control, the CDI agents communicate their estimation state and optimal control with each other (known as a “mean field”). To keep computation and message passing scalable, the CDI agents are clustered into a 2- or 3-level hierarchy, based on location and possibly battery type.
Clusters may, for example, be created for each substation level, and range from 100-1000 agents. Each cluster has a service bus topic to which all agents in the cluster publish and subscribe. To share state between clusters, a particular node in each cluster is marked as a master. This node additionally subscribes and sends state to another topic shared by other master nodes.
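The cluster and topic layout described above can be sketched as a small data structure. The sketch below is only an illustration in plain Python and does not call any Azure API: the topic name formats (`battery/<id>`, `cluster/<substation>`, `masters`), the rule for picking a master (lowest agent id), and all class and function names are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    substation: str
    agents: list = field(default_factory=list)

    @property
    def topic(self) -> str:
        # Hypothetical naming scheme for the per-cluster topic.
        return f"cluster/{self.substation}"

    @property
    def master(self) -> str:
        # Hypothetical rule: the agent with the lowest id is the master.
        return min(self.agents)

def build_clusters(agent_to_substation: dict) -> dict:
    """Group agents into one cluster per substation."""
    clusters = {}
    for agent, substation in sorted(agent_to_substation.items()):
        clusters.setdefault(substation, Cluster(substation)).agents.append(agent)
    return clusters

def subscriptions(agent: str, clusters: dict, agent_to_substation: dict) -> list:
    """Topics an agent listens to: its battery, its cluster, and the
    shared master topic if it is the master of its cluster."""
    cluster = clusters[agent_to_substation[agent]]
    topics = [f"battery/{agent}", cluster.topic]
    if agent == cluster.master:
        topics.append("masters")
    return topics

agent_to_substation = {"b1": "north", "b2": "north", "b3": "south"}
clusters = build_clusters(agent_to_substation)
for agent in agent_to_substation:
    print(agent, subscriptions(agent, clusters, agent_to_substation))
```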

Site Management

There is another set of nodes that are used for site management, that is, to be able to adapt the network of CDI agents as batteries are added or removed, or repair the network if failures occur.
The network structure is encoded by the list of agents and battery control units, as well as the list of which topics each agent should publish and subscribe to. This data is saved within the Azure Storage Service. Creating, destroying, and updating agents is done using the Azure Resource Management API.
When a new battery is installed, the system will receive a message from the battery control unit requesting to be added to the system; this request will include data about the battery (type, location, substation, etc.) that will be needed to find the appropriate CDI agent cluster to add it to.
When a battery goes offline for a short time (e.g., due to loss of connection) then the IoT hub manages the last estimate of the state of the battery, and the battery goes into the backup local control until it reconnects. However, if a battery goes offline for a long time (e.g., because it is permanently disconnected) then the Site Management component stops the associated CDI agent.
When clusters become too unbalanced, that is, they have too many or too few nodes, the site management component splits the cluster in two or combines nearby clusters into one, such as based on proximity or battery type. The site management component sends the affected agents the new lists of topics they subscribe and publish to. The picking of CDI master nodes is also done by the Site Management component.
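The split-or-combine behavior described above can be sketched as a single pass over an ordered list of clusters. This is a simplified illustration, assuming that adjacent entries in the list are "nearby" clusters and that the size limits are given; the function name `rebalance` and the thresholds are hypothetical, and a real site management component would also update topic lists and master nodes.

```python
def rebalance(clusters, min_size, max_size):
    """Split oversized clusters in two and merge undersized clusters
    into the preceding (assumed nearby) cluster when it has room."""
    result = []
    for members in clusters:
        if len(members) > max_size:
            mid = len(members) // 2
            result.extend([members[:mid], members[mid:]])
        elif (result and len(members) < min_size
              and len(result[-1]) + len(members) <= max_size):
            result[-1] = result[-1] + members
        else:
            result.append(list(members))
    return result

before = [[1, 2, 3, 4, 5, 6], [7], [8, 9, 10]]
after = rebalance(before, min_size=2, max_size=4)
print(after)
```

A single pass keeps the sketch short; a cluster more than twice the maximum size would need repeated passes.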

Monitoring

The monitoring component listens for messages from each battery control unit and each CDI agent to make sure that

- messages are being sent as expected
- overall power is being produced within tolerances of the utility
- battery health is maintained for each battery at each site
If a CDI agent is failing to control a battery, then the monitoring may alert a user over a dashboard, and also signal the site management component to restart that agent. If battery control units are failing to respond for a sufficient time, the monitoring may alert to have someone confirm that their battery is going offline, or have someone go out to the site if necessary. If power is not being produced to the requirements of the utility, or battery health is not maintained for some batteries, then the monitoring agent may send messages to the CDI agents to update their parameters to better meet demand or save battery life.
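The message-liveness part of the monitoring described above can be sketched as a timeout check. The sketch assumes the monitor keeps the time of the last message per sender; the action names, the timeout values and the function name `monitor` are hypothetical and are not taken from the described system.

```python
def monitor(last_seen, now, timeout):
    """Return the actions to take for senders that have been silent too long.

    last_seen maps (kind, id) -> time of last message in seconds,
    where kind is "agent" (CDI agent) or "unit" (battery control unit).
    """
    actions = []
    for (kind, ident), ts in sorted(last_seen.items()):
        if now - ts <= timeout[kind]:
            continue  # messages are arriving as expected
        if kind == "agent":
            # Failing CDI agent: alert on the dashboard and request a restart.
            actions.append(("dashboard_alert", ident))
            actions.append(("restart_agent", ident))
        elif kind == "unit":
            # Silent battery control unit: ask someone to confirm it is offline.
            actions.append(("confirm_offline", ident))
    return actions

last_seen = {("agent", "a1"): 100.0, ("agent", "a2"): 40.0, ("unit", "u1"): 0.0}
actions = monitor(last_seen, now=110.0, timeout={"agent": 30.0, "unit": 300.0})
print(actions)
```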

Updates/Maintenance

For updates to Battery Control code, the battery control unit is sent a first message to download the new code, and then a second message to stop the current processes and switch to the newly downloaded code. For updates to CDI Agent code, the site management component can stop running agents one at a time, and restart them with the new version of code. To deploy new code to the monitoring, site management, and utility listener, an updated version is started in parallel, and verified to work, then the old version is decommissioned.
As noted above, in some embodiments and situations, a general battery model may be trained to reflect attributes and characteristics specific to a particular battery in use, such as before controlling of the particular battery in use begins and/or during such controlling and use. Additional details regarding an example embodiment of training a model are as follows:
A parameter learning engine is described for adapting the parameters in the incremental model for the power battery cell to reflect a particular power battery cell.
The stochastic differential equation is:
dδx(t)=A(t)δx(t)dt+B(t)δu(t)dt+f(t)dt+dω(t) (1)
where the state δx(t)∈R³, δu(t)∈R¹, A(t) is a 3×3 matrix, B(t) is a 3×1 vector, ƒ(t) is a 3×1 vector and ω(t) is a 3×1 vector. The noise has zero mean and the covariance matrix W, e.g., ω(t)˜N(0, W).
The observation equation is given by
δy(t)=δx(t)+θ(t) (2)
where the measurement δy(t)∈R³, and θ(t) is a 3×1 vector. The noise has zero mean and the covariance matrix V, e.g., θ(t)˜N(0, V).
The parameter learning engine estimates the A matrix, creating a vector of the nine values vect(α₁₁, α₁₂, α₁₃, α₂₁, α₂₂, α₂₃, α₃₁, α₃₂, α₃₃). These parameters are expected to change more slowly than the incremental state dynamic updates, so a discrete Kalman filter can be used to estimate the A parameters. For this parameter learning engine, the values of B(t), δu(t), ƒ(t), ω(t), δy(t), and θ(t) are known at times t_i, t_{i−1} and t_{i−2}.
Let ψ_t=vect(α₁₁, α₁₂, α₁₃, α₂₁, α₂₂, α₂₃, α₃₁, α₃₂, α₃₃)_t
and the dynamics of the parameters is,
ψ_{t+1}=ψ_t+ε_t (3)
where the parameter noise, written here as ε_t, is N(0,λ).
Solving (1) using the variation of constants formula, we get
$\begin{matrix} δ x (t_{i + 1}) = e^{A (t_{i + 1} - t_{i})} δ x (t_{i}) + \int_{t_{i}}^{t_{i + 1}} e^{A (t_{i + 1} - τ)} (B (τ) δ u (τ) + f (τ) + ω (τ)) dτ & (4) \end{matrix}$
and using an impulsive approximation, yields,
$\begin{matrix} δ x (t_{i + 1}) = e^{A (t_{i + 1} - t_{i})} δ x (t_{i}) + e^{A (t_{i + 1} - t_{i})} (B (t_{i}) δ u (t_{i}) + f (t_{i}) + ω (t_{i})) & (5) \end{matrix}$
and applying a first order Taylor series expansion on the exponential term yields,
$\begin{matrix} δ x (t_{i + 1}) = (I + A (t_{i + 1} - t_{i})) (δ x (t_{i}) + B (t_{i}) δ u (t_{i}) + f (t_{i}) + ω (t_{i})) . & (6) \end{matrix}$
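The first order Taylor step that replaces the matrix exponential by I+A(t_{i+1}−t_i) can be checked numerically. The following Python sketch compares the two for a hypothetical 3×3 matrix A and two step lengths; the matrix values and the helper `expm_series`, which evaluates the exponential by a truncated power series, are assumptions made for illustration.

```python
import numpy as np

def expm_series(M: np.ndarray, terms: int = 30) -> np.ndarray:
    """Matrix exponential by truncated power series (adequate for small ||M||)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k      # term now holds M^k / k!
        result = result + term
    return result

# Hypothetical state matrix (illustration only).
A = np.array([[-0.5, 0.1, 0.0],
              [0.0, -0.3, 0.2],
              [0.1, 0.0, -0.4]])

errors = {}
for dt in (0.1, 0.01):
    exact = expm_series(A * dt)
    first_order = np.eye(3) + A * dt
    errors[dt] = np.abs(exact - first_order).max()
    print(dt, errors[dt])
```

The largest entry of the difference shrinks roughly with the square of the step length, which is why the approximation is reasonable when the sampling interval is short relative to the dynamics of A.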
Substituting δx(t_i)=δy(t_i)−θ(t_i) from (2) yields,
$\begin{matrix} δ y (t_{i + 1}) = (I + A (t_{i + 1} - t_{i})) (δ y (t_{i}) + B (t_{i}) δ u (t_{i}) + f (t_{i}) + ω (t_{i}) - θ (t_{i})) + θ (t_{i + 1}) . & (7) \end{matrix}$
To simplify the notation, let
γ(t _i)=δy(t _i)+B(t _i)δu(t _i)+f(t _i) (8)
then,
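$\begin{matrix} δ y (t_{i + 1}) = (I + A (t_{i + 1} - t_{i})) (γ (t_{i}) + ω (t_{i}) - θ (t_{i})) + θ (t_{i + 1}) \\ δ y (t_{i + 1}) = A (t_{i + 1} - t_{i}) γ (t_{i}) + (γ (t_{i}) + ω (t_{i}) - θ (t_{i})) + θ (t_{i + 1}) + Ω (t_{i + 1}) \end{matrix}$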
where Ω(t_i+1) is a noise term. Then we have,
$\begin{matrix} δ y (t_{i + 1}) - γ (t_{i}) = A (t_{i + 1} - t_{i}) γ (t_{i}) + ω (t_{i}) - θ (t_{i}) + θ (t_{i + 1}) + Ω (t_{i + 1}) = A (t_{i + 1} - t_{i}) (γ (t_{i})) + Ξ (t_{i + 1}) & (9) \end{matrix}$
Let the sum of the independent noise vectors be Ξ(t_{i+1})=ω(t_i)−θ(t_i)+θ(t_{i+1})+Ω(t_{i+1}), which has zero mean; because the noises are independent, its covariance matrix is the sum of the individual covariances.
Now, to estimate the nine parameter values in A, which are constant over long intervals, we write (9) at times t_i, t_{i−1} and t_{i−2},
δy(t_i)−γ(t_{i−1})=A(t_i−t_{i−1})γ(t_{i−1})+Ξ(t_i)
δy(t_{i−1})−γ(t_{i−2})=A(t_{i−1}−t_{i−2})γ(t_{i−2})+Ξ(t_{i−1})
δy(t_{i−2})−γ(t_{i−3})=A(t_{i−2}−t_{i−3})γ(t_{i−3})+Ξ(t_{i−2})
and rewrite these in matrix form as the parameter observation equation,
$\begin{matrix} [\begin{matrix} δ y_{1} (t_{i}) - γ_{1} (t_{i - 1}) \\ δ y_{2} (t_{i}) - γ_{2} (t_{i - 1}) \\ δ y_{3} (t_{i}) - γ_{3} (t_{i - 1}) \\ δ y_{1} (t_{i - 1}) - γ_{1} (t_{i - 2}) \\ δ y_{2} (t_{i - 1}) - γ_{2} (t_{i - 2}) \\ δ y_{3} (t_{i - 1}) - γ_{3} (t_{i - 2}) \\ δ y_{1} (t_{i - 2}) - γ_{1} (t_{i - 3}) \\ δ y_{2} (t_{i - 2}) - γ_{2} (t_{i - 3}) \\ δ y_{3} (t_{i - 2}) - γ_{3} (t_{i - 3}) \end{matrix}] = g (t_{i}) [\begin{matrix} a_{11} \\ a_{12} \\ a_{13} \\ a_{21} \\ a_{22} \\ a_{23} \\ a_{31} \\ a_{32} \\ a_{33} \end{matrix}] + \tilde{Ξ} (t_{i}) & (10) \end{matrix}$
where Ξ̃(t_i) is a 9×1 Gaussian vector with zero mean and a diagonal covariance matrix, with diagonal entries equal to the diagonal elements of W+2V+Cov(Ω).
Now, we have,
$\begin{matrix} g (t_{i}) = [\begin{matrix} g_{11} (t_{i}) & g_{12} (t_{i}) & g_{13} (t_{i}) \\ g_{21} (t_{i}) & g_{22} (t_{i}) & g_{23} (t_{i}) \\ g_{31} (t_{i}) & g_{32} (t_{i}) & g_{33} (t_{i}) \end{matrix}] & (11) \end{matrix}$
where
$g_{11} (t_{i}) = [\begin{matrix} γ_{1} (t_{i - 1}) (t_{i} - t_{i - 1}) & γ_{2} (t_{i - 1}) (t_{i} - t_{i - 1}) & γ_{3} (t_{i - 1}) (t_{i} - t_{i - 1}) \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]$
$g_{12} (t_{i}) = [\begin{matrix} 0 & 0 & 0 \\ γ_{1} (t_{i - 1}) (t_{i} - t_{i - 1}) & γ_{2} (t_{i - 1}) (t_{i} - t_{i - 1}) & γ_{3} (t_{i - 1}) (t_{i} - t_{i - 1}) \\ 0 & 0 & 0 \end{matrix}]$
$g_{13} (t_{i}) = [\begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ γ_{1} (t_{i - 1}) (t_{i} - t_{i - 1}) & γ_{2} (t_{i - 1}) (t_{i} - t_{i - 1}) & γ_{3} (t_{i - 1}) (t_{i} - t_{i - 1}) \end{matrix}]$
and
$g_{21} (t_{i}) = [\begin{matrix} γ_{1} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) & γ_{2} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) & γ_{3} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]$
$g_{22} (t_{i}) = [\begin{matrix} 0 & 0 & 0 \\ γ_{1} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) & γ_{2} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) & γ_{3} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) \\ 0 & 0 & 0 \end{matrix}]$
$g_{23} (t_{i}) = [\begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ γ_{1} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) & γ_{2} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) & γ_{3} (t_{i - 2}) (t_{i - 1} - t_{i - 2}) \end{matrix}]$
and
$g_{31} (t_{i}) = [\begin{matrix} γ_{1} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) & γ_{2} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) & γ_{3} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]$
$g_{32} (t_{i}) = [\begin{matrix} 0 & 0 & 0 \\ γ_{1} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) & γ_{2} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) & γ_{3} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) \\ 0 & 0 & 0 \end{matrix}]$
$g_{33} (t_{i}) = [\begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ γ_{1} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) & γ_{2} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) & γ_{3} (t_{i - 3}) (t_{i - 2} - t_{i - 3}) \end{matrix}]$
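The block structure of the 9×9 matrix in (11) can be generated compactly with a Kronecker product. The Python sketch below is an illustration under stated assumptions: the parameter vector is the row-major flattening of A (a₁₁, a₁₂, ..., a₃₃), the γ vectors and step lengths are hypothetical values, and the function name `observation_matrix` is not from the described system.

```python
import numpy as np

def observation_matrix(gammas, dts):
    """Stack the three 3x9 row blocks of (11).

    gammas: the vectors gamma(t_{i-1}), gamma(t_{i-2}), gamma(t_{i-3})
    dts:    the matching step lengths (t_i - t_{i-1}), and so on
    """
    # kron(I_3, gamma^T) places gamma^T on row j in columns 3j..3j+2,
    # so that block @ vec(A) equals A @ gamma.
    blocks = [dt * np.kron(np.eye(3), g.reshape(1, 3))
              for g, dt in zip(gammas, dts)]
    return np.vstack(blocks)

# Hypothetical values for illustration.
A = np.array([[-0.5, 0.1, 0.0],
              [0.0, -0.3, 0.2],
              [0.1, 0.0, -0.4]])
gammas = [np.array([1.0, 2.0, 3.0]),
          np.array([0.5, -1.0, 0.0]),
          np.array([2.0, 0.0, 1.0])]
dts = [0.1, 0.1, 0.2]

G = observation_matrix(gammas, dts)
print(G.shape)
```

With this construction, multiplying G by the flattened A reproduces the stacked right-hand sides A(t_k−t_{k−1})γ(t_{k−1}) of the three equations written at times t_i, t_{i−1} and t_{i−2}, without the noise terms.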
Now we use a discrete Kalman filter to estimate the parameters
$ψ_{i} = {[\begin{matrix} a_{11} \\ a_{12} \\ a_{13} \\ a_{21} \\ a_{22} \\ a_{23} \\ a_{31} \\ a_{32} \\ a_{33} \end{matrix}]}_{t_{i}} .$
As shown in (3) and (10), the parameter dynamics equation and the parameter observation equation are
ψ_{i+1}=ψ_i+ε_i
Z_i=G_iψ_i+Ξ̃_i
where the parameter noise ε_i is N(0,λ), Ξ̃_i is N(0, W+2V+Cov(Ω)), G_i=g(t_i) as shown in (11), and the observation
$Z_{i} = [\begin{matrix} δ y_{1} (t_{i}) - γ_{1} (t_{i - 1}) \\ δ y_{2} (t_{i}) - γ_{2} (t_{i - 1}) \\ δ y_{3} (t_{i}) - γ_{3} (t_{i - 1}) \\ δ y_{1} (t_{i - 1}) - γ_{1} (t_{i - 2}) \\ δ y_{2} (t_{i - 1}) - γ_{2} (t_{i - 2}) \\ δ y_{3} (t_{i - 1}) - γ_{3} (t_{i - 2}) \\ δ y_{1} (t_{i - 2}) - γ_{1} (t_{i - 3}) \\ δ y_{2} (t_{i - 2}) - γ_{2} (t_{i - 3}) \\ δ y_{3} (t_{i - 2}) - γ_{3} (t_{i - 3}) \end{matrix}]$
as shown in (10).
The Discrete Kalman filter equations follow.
The predictor equation of the state estimate is
ψ_{i+1|i}=ψ_{i|i} (12)
and the corrector equation with measurement is
ψ_{i+1|i+1}=ψ_{i+1|i}+K_i^gain(Z_i−ψ_{i+1|i}) (13)
where K_i^gain is the gain matrix.
The covariance matrix for the Kalman filter, denoted as Σ, is computed by the predictor equation,
Σ_{i+1|i}=Σ_{i|i}+λ (14)
and the corrector equation,
Σ_{i+1|i+1}=[I−K_i^gain]Σ_{i+1|i}. (15)
The gain is
K_i^gain=Σ_{i+1|i}(Σ_{i+1|i}+W+2V+Cov(Ω))⁻¹. (16)
The initial condition is
Σ_0|0=0₀. (17)
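The predictor and corrector steps above can be sketched in Python for the random-walk parameter model. Note the assumptions: the sketch uses the standard textbook Kalman filter form in which the observation matrix G_i appears explicitly in the gain, innovation and covariance update, whereas equations (13), (15) and (16) as printed do not show G_i. The initial covariance, the noise levels, the true parameter values and the synthetic noise-free observations are all hypothetical.

```python
import numpy as np

def observation_matrix(gammas, dt):
    """9x9 matrix G for three gamma vectors and a fixed step length dt."""
    return np.vstack([dt * np.kron(np.eye(3), g.reshape(1, 3)) for g in gammas])

def kalman_step(psi, sigma, z, g_mat, lam, r):
    """One predictor/corrector cycle for the random-walk parameter model."""
    psi_pred = psi                       # predictor, as in (12)
    sigma_pred = sigma + lam             # predictor, as in (14)
    # Standard-form gain with observation matrix (an assumption, see lead-in).
    innovation_cov = g_mat @ sigma_pred @ g_mat.T + r
    gain = sigma_pred @ g_mat.T @ np.linalg.inv(innovation_cov)
    psi_new = psi_pred + gain @ (z - g_mat @ psi_pred)
    sigma_new = (np.eye(psi.size) - gain @ g_mat) @ sigma_pred
    return psi_new, sigma_new

rng = np.random.default_rng(0)
a_true = np.array([[-0.5, 0.1, 0.0],
                   [0.0, -0.3, 0.2],
                   [0.1, 0.0, -0.4]])    # hypothetical true parameters
psi_true = a_true.reshape(-1)

psi, sigma = np.zeros(9), np.eye(9)      # hypothetical initial condition
lam = 1e-8 * np.eye(9)                   # slow parameter drift
r = 1e-6 * np.eye(9)                     # stands in for W + 2V + Cov(Omega)

for _ in range(30):
    gammas = rng.normal(size=(3, 3))     # synthetic gamma vectors
    g_mat = observation_matrix(gammas, dt=0.1)
    z = g_mat @ psi_true                 # noise-free synthetic observation
    psi, sigma = kalman_step(psi, sigma, z, g_mat, lam, r)

print(np.round(psi.reshape(3, 3), 4))
```

Because the synthetic observations are noise-free, the estimate settles very close to the hypothetical true parameters; with real measurements the covariance choices would need tuning.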
FIG. 49 illustrates an architecture 4900 of the Parameter Learning Engine, with the Parameter Adaptive Engine (PAE) being a central component. The parameters of the model change slower than the state of the system.
As noted above, in some embodiments and situations, the internal state of an operating battery may be estimated, such as to estimate an internal temperature of the battery. Rather than managing battery temperature in the manner discussed herein, it is noted that prior systems, if they use temperature at all, typically merely shut down a battery's operation if the external temperature is too high, rather than control the battery's operations to manage the temperature. In addition, while not described in the example temperature model below, in some embodiments additional operations may be performed to control the internal temperature of the battery based at least in part on managing an external temperature surrounding the battery, such as by controlling the external temperature based at least in part on an estimated internal state of the battery.
Additional details regarding an example embodiment of using a battery temperature model to estimate such internal battery temperature are as follows:

- The incremental dynamic behavior of the battery temperature (running at a higher speed) is given by a differential equation,

$\begin{matrix} δ ? (t) = ? δ T (t - τ) + v (t) ? indicates text missing or illegible when filed & (1) \end{matrix}$

- where τ is a time delay representing diffusion effect, and v(t) is a linear function of the incremental state vector δr(t) (power, voltage, current) and the incremental control δu(t) for the lower speed control tracker

$\begin{matrix} v (t) = b \cdot δ z (t) + c \cdot δ u (t) + v (t) . & (2) \end{matrix}$

- The parameters a,b,c and τ can be estimated by a non-linear least squares estimator using the historical data, and v(t) is a Gaussian noise.
- The Laplace transform of (1) is:

$\begin{matrix} ? δ T (s) = α ? δ T (s) + v (s) & (3) \\ (s - α ?) δ T (s) = v (s) & (4) \end{matrix}$

- We approximate the term (s − α ?) in (4) with a Padé approximant [1] of the form

$\frac{P (s)}{Q (s)},$
where P and Q are polynomials in s, and the order o(P)>o(Q). The Laplace transform of (1) is approximated by
$\begin{matrix} \frac{P (s)}{Q (s)} δ T (s) = v (s) & (5) \end{matrix}$

- For a low frequency approximation, we select a Padé approximant of the form

$\frac{P (s)}{Q (s)}$
with
$\begin{matrix} P (s) = ? + α_{1} s + ? & (6) \\ Q (s) = β_{1} s + ? & (7) \end{matrix}$

- We define

$\begin{matrix} w (s) = P (s) δ T (s) & (8) \end{matrix}$

- Replacing P(s) in (8) with (6), yields

$\begin{matrix} ? δ T (s) + α_{1} ? δ T (s) + ? δ T (s) = w (s) & (9) \end{matrix}$

- The inverse Laplace transform of (9) is

$\begin{matrix} δ ? (t) + α_{1} δ ? (t) + ? δ T (t) = w (t) & (10) \end{matrix}$
Define ?, we have
$\begin{matrix} ? = ? = ? - ? + ? = ? - ? + ? [\begin{matrix} ? \\ ? \end{matrix}] = [?] [?] + [\begin{matrix} 0 \\ 1 \end{matrix}] v (t) & (11) \end{matrix}$
From (5) and (8), we also have
$\begin{matrix} w (s) = Q (s) v (s) & (12) \end{matrix}$
Replacing Q(s) in (12) with (7), yields
$\begin{matrix} w (s) = β_{1} ? (s) + ? (s) & (13) \end{matrix}$
The inverse Laplace transform of (13) is:
$\begin{matrix} w (t) = β_{1} ? (t) + ? (t) & (14) \end{matrix}$
Replacing v(t) in (14) with (2), yields
$\begin{matrix} w (t) = β_{1} b \cdot ? (t) + β_{1} c \cdot ? (t) + ? \cdot ? (t) + ? c \cdot ? (t) + v_{1} (t) & (15) \end{matrix}$
where v_1(t) is a Gaussian noise.
In (15), we set the values of ? to be constant during a unit time interval of the lower speed controller, i.e.:
$\begin{matrix} ? (t) = ? (?), ? (t) = ? (?), ? (t) = ? (?) & (16) \end{matrix}$
and let ? be the control variable for the high speed controller, yielding
$\begin{matrix} [?] = [?] [?] + [?] ? + [?] & (17) \end{matrix}$
The state equation in (17) can be rewritten as:
$\begin{matrix} ? = ? + ? + f + Ω (t) with A = [?], B = [?] and f = [?] . & (18) \end{matrix}$
and Ω(t) is a Gaussian noise.
The optimal control tracking problem for the high speed controller is summarized as:
$\begin{matrix} ? E ? + ? dt s . t . & (19) \\ ? & (20) \end{matrix}$
By the separation principle, we can get the incremental control for the high speed controller.
$\begin{matrix} ? (t) = K ? (t) + ? & (21) \end{matrix}$
where ? is the state estimate obtained by running a discrete Kalman filter, and K and Φ are obtained by solving the deterministic tracking problem.

- In the end, we take the linear combination of the two incremental controls ? to get the incremental control δu(t),

$\begin{matrix} δ u (t) = ? ? (t) + ? ? (t) & (22) \end{matrix}$

- where ? = 1 and 0 ≦ ? ≦ 1.
- Notes: the (2,1) Padé approximation for

$? = ?$
then the (2,1) Padé approximation for ? is
$\begin{matrix} \frac{P (s)}{Q (s)} = ? \frac{? + 4 ?}{- 2 (?) + 6} = \frac{(?) ? + (?) ? - 6 a}{? + ?} = \frac{? + ? - ?}{? + ?} & (23) \end{matrix}$

- From (6) and (7), we have

$\begin{matrix} ? = ?, ? = ?, ? = ?, ? = ? & (24) \end{matrix}$

- In general, if P and Q are on an order of m and n respectively, (11) and (14) can be generalized as follows

$\begin{matrix} [\begin{matrix} ? \\ ? \\ ⋮ \\ ? \\ ? \end{matrix}] = [\begin{matrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & 0 & \dots & 1 \\ ? & ? & ? & \dots & ? \end{matrix}] [\begin{matrix} ? \\ ? \\ ⋮ \\ ? \\ ? \end{matrix}] + [\begin{matrix} ? \\ ? \\ ⋮ \\ ? \\ 1 \end{matrix}] ? & (25) \\ ? = ? (t) + ? + \dots + ? & (26) \end{matrix}$
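As background for the Padé step in the temperature model above, the following Python sketch evaluates a standard rational approximant of the exponential, exp(−x) ≈ (6 − 2x)/(6 + 4x + x²), which matches the Taylor series of exp(−x) through the x³ term. Its use here rests on two assumptions: that the illegible delay term in (3) and (4) is the usual Laplace-domain delay factor exp(−τs), and that the legible fragments of (23) (the "+ 4" and "−2(·) + 6" terms) refer to this approximant. The function names and sample values are hypothetical.

```python
import math

def pade_exp_neg(x: float) -> float:
    """Rational approximant (6 - 2x) / (6 + 4x + x^2) of exp(-x)."""
    return (6.0 - 2.0 * x) / (6.0 + 4.0 * x + x * x)

def delay_factor(s: float, tau: float) -> float:
    """Approximate the delay term exp(-tau * s) for real s >= 0
    (ASSUMPTION: this is the term being approximated in the model)."""
    return pade_exp_neg(tau * s)

for x in (0.0, 0.1, 0.5, 1.0):
    approx = pade_exp_neg(x)
    exact = math.exp(-x)
    print(f"x={x:.1f} approx={approx:.6f} exact={exact:.6f}")
```

The approximation is most accurate for small τs, which is consistent with the low frequency use described above.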
FIG. 51 is a block diagram illustrating an example architecture 5100 for a battery control system. The example architecture includes, for example, the following components:

- a Boost/Buck Converter, such as a DC-to-DC converter used to match the average pulsed output to the commanded power from the battery controller;
- an Inverter/Rectifier, such as with the power inverter converting DC to AC when battery is discharging, and with the rectifier converting AC to DC when battery is charging;
- an A/D converter, such as an analog-to-digital converter that converts a continuous physical quantity to a digital number that represents the quantity's amplitude;
- a D/A converter, such as a digital-to-analog converter that converts digital data (usually binary) into an analog signal (current, voltage, or electric charge);
- a Battery Controller, such as an incremental feedback hybrid tracking control system (e.g., that includes a state estimator, a parameter learning engine, and a state and state rate tracker controller, as discussed in greater detail elsewhere herein) that drives/controls the Boost/Buck converter so that the battery operates in near resonance with a desired power signal generated by a CDI agent;
- a CDI Agent, such as to infer battery parameters and to use longevity and performance rules to generate the tracking signal for the desired power signal; and
- a Demand Forecaster, such as to generate the desired output power/charge power forecast of the battery.

FIG. 46 is a block diagram illustrating example computing systems suitable for performing techniques for implementing automated control systems to control or otherwise manipulate at least some operations of specified physical systems or other target systems in configured manners, such as to control physical target systems having one or more batteries by using characteristics of each battery's state to perform automated control of DC power that is provided from the battery (e.g., in a real-time manner and to optimize long-term operation of the battery), such as in the manner discussed above with respect to FIGS. 30-45 and 47-51 and elsewhere herein. In particular, FIG. 46 illustrates a server computing system 4600 suitable for providing at least some functionality of a CDD system, although in other embodiments multiple computing systems may be used for the execution (e.g., to have distinct computing systems executing the CDD Decision Module Construction component for initial configuration and setup before run-time control occurs, and one or more copies of the CDD Control Action Determination component 4644 and/or the CDD Coordinated Control Managements component 4646 for the actual run-time control). FIG. 46 also illustrates various client computer systems 4650 that may be used by customers or other users of the CDD system 4640, as well as one or more target systems with batteries to be controlled (in this example, target system 1 4660 and target system 2 4670, which are accessible to the CDD system 4640 over one or more computer networks 4690).
The server computing system 4600 has components in the illustrated embodiment that include one or more hardware CPU (“central processing unit”) computer processors 4605, various I/O (“input/output”) hardware components 4610, storage 4620, and memory 4630. The illustrated I/O components include a display 4611, a network connection 4612, a computer-readable media drive 4613, and other I/O devices 4615 (e.g., a keyboard, a mouse, speakers, etc.). In addition, the illustrated client computer systems 4650 may each have components similar to those of server computing system 4600, including one or more CPUs 4651, I/O components 4652, storage 4654, and memory 4657, although some details are not illustrated for the computing systems 4650 for the sake of brevity. The target systems 4660 and 4670 may also each include one or more computing systems (not shown) having components that are similar to some or all of the components illustrated with respect to server computing system 4600, but such computing systems and components are not illustrated in this example for the sake of brevity.
The CDD system 4640 is executing in memory 4630 and includes components 4642-4646, and in some embodiments the system and/or components each includes various software instructions that when executed program one or more of the CPU processors 4605 to provide an embodiment of a CDD system as described elsewhere herein. The CDD system 4640 may interact with computing systems 4650 over the network 4690 (e.g., via the Internet and/or the World Wide Web, via a private cellular network, etc.), as well as the target systems 4660 and 4670 in this example. In this example embodiment, the CDD system includes functionality related to generating and deploying decision modules in configured manners for customers or other users, as discussed in greater detail elsewhere herein. The other computing systems 4650 may also be executing various software as part of interactions with the CDD system 4640 and/or its components. For example, client computing systems 4650 may be executing software in memory 4657 to interact with CDD system 4640 (e.g., as part of a Web browser, a specialized client-side application program, etc.), such as to interact with one or more interfaces (not shown) of the CDD system 4640 to configure and deploy automated control systems (e.g., stored automated control systems 4625 that were previously created by the CDD system 4640 for use in controlling one or more physical target systems with batteries) or other decision modules 4629, as well as to perform various other types of actions, as discussed in greater detail elsewhere. Various information related to the functionality of the CDD system 4640 may be stored in storage 4620, such as information 4621 related to users of the CDD system (e.g., account information), and information 4623 related to one or more target systems that have batteries to be controlled.
It will be appreciated that computing systems 4600 and 4650 and target systems 4660 and 4670 are merely illustrative and are not intended to limit the scope of the present invention. The computing systems may instead each include multiple interacting computing systems or devices, and the computing systems/nodes may be connected to other devices that are not illustrated, including through one or more networks such as the Internet, via the Web, or via private networks (e.g., mobile communication networks, etc.). More generally, a computing node or other computing system or device may comprise any combination of hardware that may interact and perform the described types of functionality, including without limitation desktop or other computers, database servers, network storage devices and other network devices, PDAs, cell phones, wireless phones, pagers, electronic organizers, Internet appliances, television-based systems (e.g., using set-top boxes and/or personal/digital video recorders), and various other consumer products that include appropriate communication capabilities. In addition, the functionality provided by the illustrated CDD system 4640 and its components may in some embodiments be distributed in additional components. Similarly, in some embodiments some of the functionality of the CDD system 4640 and/or CDD components 4642-4646 may not be provided and/or other additional functionality may be available.
It will also be appreciated that, while various items are illustrated as being stored in memory or on storage while being used, these items or portions of them may be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software modules and/or systems may execute in memory on another device and communicate with the illustrated computing systems via inter-computer communication. Thus, in some embodiments, some or all of the described techniques may be performed by hardware means that include one or more processors and/or memory and/or storage when configured by one or more software programs (e.g., by the CDD system 4640 and/or the CDD components 4642-4646) and/or data structures, such as by execution of software instructions of the one or more software programs and/or by storage of such software instructions and/or data structures. Furthermore, in some embodiments, some or all of the systems and/or components may be implemented or provided in other manners, such as by using means that are implemented at least partially or completely in firmware and/or hardware, including, but not limited to, one or more application-specific integrated circuits (ASICs), standard integrated circuits, controllers (e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers), field-programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), etc. Some or all of the components, systems and data structures may also be stored (e.g., as software instructions or structured data) on a non-transitory computer-readable storage medium, such as a hard disk or flash drive or other non-volatile storage device, volatile or non-volatile memory (e.g., RAM), a network storage device, or a portable media article to be read by an appropriate drive (e.g., a DVD disk, a CD disk, an optical disk, etc.) or via an appropriate connection. 
The systems, components and data structures may also in some embodiments be transmitted as generated data signals (e.g., as part of a carrier wave or other analog or digital propagated signal) on a variety of computer-readable transmission mediums, including wireless-based and wired/cable-based mediums, and may take a variety of forms (e.g., as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames). Such computer program products may also take other forms in other embodiments. Accordingly, the present invention may be practiced with other computer system configurations.
For illustrative purposes, some additional details are included below regarding some embodiments in which specific types of operations are performed in specific manners, including with respect to particular types of target systems and for particular types of control activities determined in particular manners. These examples are provided for illustrative purposes and are simplified for the sake of brevity, and the inventive techniques may be used in a wide variety of other situations, including in other environments and with other types of automated control action determination techniques, some of which are discussed below.
FIG. 1 is a network diagram illustrating an example environment in which a system for performing cooperative distributed control of one or more target systems may be configured and initiated. In particular, an embodiment of a CDD system 140 is executing on one or more computing systems 190, including in the illustrated embodiment to operate in an online manner and provide a graphical user interface (GUI) (not shown) and/or other interfaces 119 to enable one or more remote users of client computing systems 110 to interact over one or more intervening computer networks 100 with the CDD system 140 to configure and create one or more decision modules to include as part of an automated control system to use with each of one or more target systems to be controlled.
In particular, target system 1 160 and target system 2 170 are example target systems illustrated in this example, although it will be appreciated that only one target system or numerous target systems may be available in particular embodiments and situations, and that each such target system may include a variety of mechanical, electronic, chemical, biological, and/or other types of components to implement operations of the target system in a manner specific to the target system. In this example, the one or more users (not shown) may interact with the CDD system 140 to generate an example automated control system 122 for target system 1, with the automated control system including multiple decision modules 124 in this example that will cooperatively interact to control portions of the target system 1 160 when later deployed and implemented. The process of the users interacting with the CDD system 140 to create the automated control system 122 may involve a variety of interactions over time, including in some cases independent actions of different groups of users, as discussed in greater detail elsewhere. In addition, as part of the process of creating and/or training or testing automated control system 122, it may perform one or more interactions with the target system 1 as illustrated, such as to obtain partial initial state information, although some or all training activities may in at least some embodiments include simulating effects of control actions in the target system 1 without actually implementing those control actions at that time.
After the automated control system 122 is created, the automated control system may be deployed and implemented to begin performing operations involving controlling the target system 1 160, such as by optionally executing the automated control system 122 on the one or more computing systems 190 of the CDD system 140, so as to interact over the computer networks 100 with the target system 1. In other embodiments and situations, the automated control system 122 may instead be deployed by executing local copies of some or all of the automated control system 122 (e.g., one or more of the multiple decision modules 124) in a manner local to the target system 1, as illustrated with respect to a deployed copy 121 of some or all of automated control system 1, such as on one or more computing systems (not shown) that are part of the target system 1.
In a similar manner to that discussed with respect to automated control system 122, one or more users (whether the same users, overlapping users, or completely unrelated users to those that were involved in creating the automated control system 122) may similarly interact over the computer network 100 with the CDD system 140 to create a separate automated control system 126 for use in controlling some or all of the target system 2 170. In this example, the automated control system 126 for target system 2 includes only a single decision module 128 that will perform all of the control actions for the automated control system 126. The automated control system 126 may similarly be deployed and implemented for target system 2 in a manner similar to that discussed with respect to automated control system 122, such as to execute locally on the one or more computing systems 190 and/or on one or more computing systems (not shown) that are part of the target system 2, although a deployed copy of automated control system 2 is not illustrated in this example. It will be further appreciated that the automated control systems 122 and/or 126 may further include other components and/or functionality that are separate from the particular decision modules 124 and 128, respectively, although such other components and/or functionality are not illustrated in FIG. 1.
The network 100 may, for example, be a publicly accessible network of linked networks, possibly operated by various distinct parties, such as the Internet, with the CDD system 140 available to any users or only certain users over the network 100. In other embodiments, the network 100 may be a private network, such as, for example, a corporate or university network that is wholly or partially inaccessible to non-privileged users. In still other embodiments, the network 100 may include one or more private networks with access to and/or from the Internet. Thus, while the CDD system 140 in the illustrated embodiment is implemented in an online manner to support various users over the one or more computer networks 100, in other embodiments a copy of the CDD system 140 may instead be implemented in other manners, such as to support a single user or a group of related users (e.g., a company or other organization), such as if the one or more computer networks 100 are instead an internal computer network of the company or other organization, and with such a copy of the CDD system optionally not being available to other users external to the company or other organizations. The online version of the CDD system 140 and/or local copy version of the CDD system 140 may in some embodiments and situations operate in a fee-based manner, such that the one or more users provide various fees to use various operations of the CDD system, such as to perform interactions to generate decision modules and corresponding automated control systems, and/or to deploy or implement such decision modules and corresponding automated control systems in various manners. 
In addition, the CDD system 140, each of its components (including component 142 and optional other components 117, such as one or more CDD Control Action Determination components and/or one or more CDD Coordinated Control Management components), each of the decision modules, and/or each of the automated control systems may include software instructions that execute on one or more computing systems (not shown) by one or more processors (not shown), such as to configure those processors and computing systems to operate as specialized machines with respect to performing their programmed functionality.
FIG. 2 is a network diagram illustrating an example environment in which a system for performing cooperative distributed control of target systems may be implemented, and in particular continues the examples discussed with respect to FIG. 1. In the example environment of FIG. 2, target system 1 160 is again illustrated, with the automated control system 122 now being deployed and implemented to use in actively controlling the target system 1 160. In the example of FIG. 2, the decision modules 124 are represented as individual decision modules 124 a, 124 b, etc., to 124 n, and may be executing locally to the target system 1 160 and/or in a remote manner over one or more intervening computer networks (not shown). In the illustrated example, each of the decision modules 124 includes a local copy of a CDD Control Action Determination component 144, such as with component 144 a supporting its local decision module 124 a, component 144 b supporting its local decision module 124 b, and component 144 n supporting its local decision module 124 n. Similarly, the actions of the various decision modules 124 are coordinated and synchronized in a peer-to-peer manner in the illustrated embodiment, with each of the decision modules 124 including a copy of a CDD Coordinated Control Management component 146 to perform such synchronization, with component 146 a supporting its local decision module 124 a, component 146 b supporting its local decision module 124 b, and component 146 n supporting its local decision module 124 n.
As the decision modules 124 and automated control system 122 execute, various interactions 175 between the decision modules 124 are performed, such as to share information about current models and other state of the decision modules to enable cooperation and coordination between various decision modules, such as for a particular decision module to operate in a partially synchronized consensus manner with respect to one or more other decision modules (and in some situations in a fully synchronized manner in which the consensus actions of all of the decision modules 124 converge). During operation of the decision modules 124 and automated control system 122, various state information 143 may be obtained by the automated control system 122 from the target system 160, such as initial state information and changing state information over time, and including outputs or other results in the target system 1 from control actions performed by the decision modules 124.
The target system 1 in this example includes various control elements 161 that the automated control system 122 may manipulate, and in this example each decision module 124 may have a separate group of one or more control elements 161 that it manipulates (such that decision module A 124 a performs interactions 169 a to perform control actions A 147 a on control elements A 161 a, decision module B 124 b performs interactions 169 b to perform control actions B 147 b on control elements B 161 b, and decision module N 124 n performs interactions 169 n to perform control actions N 147 n on control elements N 161 n). Such control actions affect the internal state 163 of other elements of the target system 1, including optionally to cause or influence one or more outputs 162. As operation of the target system 1 is ongoing, at least some of the internal state information 163 is provided to some or all of the decision modules to influence their ongoing control actions, with each of the decision modules 124 a-124 n possibly having a distinct set of state information 143 a-143 n, respectively, in this example.
As discussed in greater detail elsewhere, each decision module 124 may use such state information 143 and a local model 145 of the decision module for the target system to determine particular control actions 147 to perform next, such as for each of multiple time periods, although in other embodiments and situations, a particular automated control system may perform interactions with a particular target system for only one time period or only for some time periods. For example, the local CDD Control Action Determination component 144 for a decision module 124 may determine a near-optimal local solution for that decision module's local model 145, with the local CDD Coordinated Control Management component 146 determining a synchronized consensus solution to reflect others of the decision modules 124, including to update the decision module's local model 145 based on such local and/or synchronized solutions that are determined. Thus, during execution of the automated control system 122, the automated control system performs various interactions with the target system 160, including to request state information, and to provide instructions to modify values of or otherwise manipulate control elements 161 of the target system 160. For example, for each of multiple time periods, decision module 124 a may perform one or more interactions 169 a with one or more control elements 161 a of the target system, while decision module 124 b may similarly perform one or more interactions 169 b with one or more separate control elements B 161 b, and decision module 124 n may perform one or more interactions 169 n with one or more control elements N 161 n of the target system 160. In other embodiments and situations, at least some control elements may not perform control actions during each time period.
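As a non-limiting illustration of the per-time-period flow just described, the following Python sketch has each decision module first compute a local solution from its own state information and then blend that solution with a shared value derived from all of the modules. The class and function names, the halfway step toward a target value, and the use of simple averaging for synchronization are all assumptions made for illustration only, and are not the control action determination or consensus techniques of any particular embodiment.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DecisionModule:
    """Hypothetical stand-in for one decision module (e.g., 124 a)."""
    name: str
    target: float  # value preferred by this module's local model (assumed)

    def local_solution(self, state: float) -> float:
        # Placeholder local step: move halfway from the observed state
        # toward the target preferred by the local model.
        return state + 0.5 * (self.target - state)


def run_period(modules: List[DecisionModule],
               states: Dict[str, float]) -> Dict[str, float]:
    """Return one control action per module for a single time period."""
    # Step 1: each module determines its own local solution.
    local = {m.name: m.local_solution(states[m.name]) for m in modules}
    # Step 2: placeholder synchronization, here a plain average.
    shared = sum(local.values()) / len(local)
    # Step 3: each module acts on a blend of its local and shared values.
    return {name: 0.5 * value + 0.5 * shared for name, value in local.items()}


modules = [DecisionModule("a", target=10.0), DecisionModule("b", target=20.0)]
actions = run_period(modules, {"a": 0.0, "b": 0.0})
```

In this sketch each module still produces its own action for its own control elements, while the shared value pulls the actions of the modules toward one another.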
While example target system 2 170 is not illustrated in FIG. 2, further details are illustrated for decision module 128 of automated control system 126 for reference purposes, although such a decision module 128 would not typically be implemented together with the decision modules 124 controlling target system 1. In particular, the deployed copy of automated control system 126 includes only the single executing decision module 128 in this example, although in other embodiments the automated control system 126 may include other components and functionality. In addition, since only a single decision module 128 is implemented for the automated control system 126, the decision module 128 includes a local CDD Control Action Determination component 244, but does not in the illustrated embodiment include any local CDD Coordinated Control Management component, since there are no other decision modules with which to synchronize and interact.
While not illustrated in FIGS. 1 and 2, the distributed nature of operations of automated control systems such as automated control system 122 allows partially decoupled operations of the various decision modules, including to allow the group of decision modules 124 to be modified over time while the automated control system 122 is in use, such as to add new decision modules 124 and/or to remove existing decision modules 124. In a similar manner, changes may be made to particular decision modules 124 and/or 128, such as to change rules or other restrictions specific to a particular decision module and/or to change goals specific to a particular decision module over time, with a new corresponding model being generated and deployed within such a decision module, including in some embodiments and situations while the corresponding automated control system continues control operations of a corresponding target system. In addition, while each automated control system is described as controlling a single target system in the examples of FIGS. 1 and 2, in other embodiments and situations, other configurations may be used, such as for a single automated control system to control multiple target systems (e.g., multiple inter-related target systems, multiple target systems of the same type, etc.), and/or multiple automated control systems may operate to control a single target system, such as by each operating independently to control different portions of that target system. It will be appreciated that other configurations may similarly be used in other embodiments and situations.
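One way to picture adding and removing decision modules while an automated control system remains in use is the following minimal Python sketch. The `AutomatedControlSystem` class and its method names are hypothetical, and each decision module is reduced to a plain function from a state value to a control value; an actual embodiment would also regenerate and deploy the corresponding models.

```python
from typing import Callable, Dict

Decide = Callable[[float], float]


class AutomatedControlSystem:
    """Hypothetical registry of decision modules that can change at run time."""

    def __init__(self) -> None:
        self._modules: Dict[str, Decide] = {}

    def add_module(self, name: str, decide: Decide) -> None:
        # Adding or replacing a module takes effect on the next step.
        self._modules[name] = decide

    def remove_module(self, name: str) -> None:
        # Removing an unknown module is treated as a no-op.
        self._modules.pop(name, None)

    def step(self, state: float) -> Dict[str, float]:
        # Iterate over a snapshot so the registry may be changed between steps.
        return {name: decide(state)
                for name, decide in list(self._modules.items())}


system = AutomatedControlSystem()
system.add_module("a", lambda s: s + 1.0)
first = system.step(1.0)
system.add_module("b", lambda s: s * 2.0)
second = system.step(1.0)
system.remove_module("a")
third = system.step(3.0)
```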
FIG. 3 is a block diagram illustrating example computing systems suitable for performing techniques for implementing automated control systems to control or otherwise manipulate at least some operations of specified physical systems or other target systems in configured manners. In particular, FIG. 3 illustrates a server computing system 300 suitable for providing at least some functionality of a CDD system, although in other embodiments multiple computing systems may be used for the execution (e.g., to have distinct computing systems executing the CDD Decision Module Construction component for initial configuration and setup before run-time control occurs, and one or more copies of the CDD Control Action Determination component 344 and/or the CDD Coordinated Control Management component 346 for the actual run-time control). FIG. 3 also illustrates various client computer systems 350 that may be used by customers or other users of the CDD system 340, as well as one or more target systems (in this example, target system 1 360 and target system 2 370, which are accessible to the CDD system 340 over one or more computer networks 390).
The server computing system 300 has components in the illustrated embodiment that include one or more hardware CPU (“central processing unit”) computer processors 305, various I/O (“input/output”) hardware components 310, storage 320, and memory 330. The illustrated I/O components include a display 311, a network connection 312, a computer-readable media drive 313, and other I/O devices 315 (e.g., a keyboard, a mouse, speakers, etc.). In addition, the illustrated client computer systems 350 may each have components similar to those of server computing system 300, including one or more CPUs 351, I/O components 352, storage 354, and memory 357, although some details are not illustrated for the computing systems 350 for the sake of brevity. The target systems 360 and 370 may also each include one or more computing systems (not shown) having components that are similar to some or all of the components illustrated with respect to server computing system 300, but such computing systems and components are not illustrated in this example for the sake of brevity.
The CDD system 340 is executing in memory 330 and includes components 342-346, and in some embodiments the system and/or components each includes various software instructions that when executed program one or more of the CPU processors 305 to provide an embodiment of a CDD system as described elsewhere herein. The CDD system 340 may interact with computing systems 350 over the network 390 (e.g., via the Internet and/or the World Wide Web, via a private cellular network, etc.), as well as the target systems 360 and 370 in this example. In this example embodiment, the CDD system includes functionality related to generating and deploying decision modules in configured manners for customers or other users, as discussed in greater detail elsewhere herein. The other computing systems 350 may also be executing various software as part of interactions with the CDD system 340 and/or its components. For example, client computing systems 350 may be executing software in memory 357 to interact with CDD system 340 (e.g., as part of a Web browser, a specialized client-side application program, etc.), such as to interact with one or more interfaces (not shown) of the CDD system 340 to configure and deploy automated control systems (e.g., stored automated control systems 325 that were previously created by the CDD system 340) or other decision modules 329, as well as to perform various other types of actions, as discussed in greater detail elsewhere. Various information related to the functionality of the CDD system 340 may be stored in storage 320, such as information 321 related to users of the CDD system (e.g., account information), and information 323 related to one or more target systems.
It will be appreciated that computing systems 300 and 350 and target systems 360 and 370 are merely illustrative and are not intended to limit the scope of the present invention. The computing systems may instead each include multiple interacting computing systems or devices, and the computing systems/nodes may be connected to other devices that are not illustrated, including through one or more networks such as the Internet, via the Web, or via private networks (e.g., mobile communication networks, etc.). More generally, a computing node or other computing system or device may comprise any combination of hardware that may interact and perform the described types of functionality, including without limitation desktop or other computers, database servers, network storage devices and other network devices, PDAs, cell phones, wireless phones, pagers, electronic organizers, Internet appliances, television-based systems (e.g., using set-top boxes and/or personal/digital video recorders), and various other consumer products that include appropriate communication capabilities. In addition, the functionality provided by the illustrated CDD system 340 and its components may in some embodiments be distributed in additional components. Similarly, in some embodiments some of the functionality of the CDD system 340 and/or CDD components 342-346 may not be provided and/or other additional functionality may be available.
It will also be appreciated that, while various items are illustrated as being stored in memory or on storage while being used, these items or portions of them may be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software modules and/or systems may execute in memory on another device and communicate with the illustrated computing systems via inter-computer communication. Thus, in some embodiments, some or all of the techniques may be performed by hardware means that include one or more processors and/or memory and/or storage when configured by one or more software programs (e.g., by the CDD system 340 and/or the CDD components 342-346) and/or data structures, such as by execution of software instructions of the one or more software programs and/or by storage of such software instructions and/or data structures. Furthermore, in some embodiments, some or all of the systems and/or components may be implemented or provided in other manners, such as by using means that are implemented at least partially or completely in firmware and/or hardware, including, but not limited to, one or more application-specific integrated circuits (ASICs), standard integrated circuits, controllers (e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers), field-programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), etc. Some or all of the components, systems and data structures may also be stored (e.g., as software instructions or structured data) on a non-transitory computer-readable storage medium, such as a hard disk or flash drive or other non-volatile storage device, volatile or non-volatile memory (e.g., RAM), a network storage device, or a portable media article to be read by an appropriate drive (e.g., a DVD disk, a CD disk, an optical disk, etc.) or via an appropriate connection. 
The systems, components and data structures may also in some embodiments be transmitted as generated data signals (e.g., as part of a carrier wave or other analog or digital propagated signal) on a variety of computer-readable transmission mediums, including wireless-based and wired/cable-based mediums, and may take a variety of forms (e.g., as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames). Such computer program products may also take other forms in other embodiments. Accordingly, the present invention may be practiced with other computer system configurations.
FIG. 4 is a flow diagram of an example embodiment of a Collaborative Distributed Decision (CDD) system routine 400. The routine may, for example, be provided by execution of the CDD system 340 of FIG. 3 and/or the CDD system 140 of FIG. 1, such as to provide functionality to construct and implement automated control systems for specified target systems.
The illustrated embodiment of the routine begins at block 410, where information or instructions are received. If it is determined in block 420 that the information or instructions of block 410 include an indication to create or revise one or more decision modules for use as part of an automated control system for a particular target system, the routine continues to block 425 to initiate execution of a Decision Module Construction component, and in block 430 obtains and stores one or more resulting decision modules for the target system that are created in block 425. One example of a routine for such a Decision Module Construction component is discussed in greater detail with respect to FIGS. 5A-5B.
After block 430, or if it is instead determined in block 420 that the information or instructions received in block 410 are not to create or revise one or more decision modules, the routine continues to block 440 to determine whether the information or instructions received in block 410 indicate to deploy one or more created decision modules to control a specified target system, such as for one or more decision modules that are part of an automated control system for that target system. The one or more decision modules to deploy may have been created immediately prior with respect to block 425, such that the deployment occurs in a manner that is substantially simultaneous with the creation, or in other situations may include one or more decision modules that were created at a previous time and stored for later use. If it is determined to deploy one or more such decision modules for such a target system, the routine continues to block 450 to initiate the execution of those one or more decision modules for that target system, such as on one or more computing systems local to an environment of the target system, or instead on one or more remote computing systems that communicate with the target system over one or more intermediary computer networks (e.g., one or more computing systems under control of a provider of the CDD system).
After block 450, the routine continues to block 460 to determine whether to perform distributed management of multiple decision modules being deployed in a manner external to those decision modules, such as via one or more centralized Coordinated Control Management components. If so, the routine continues to block 465 to initiate execution of one or more such centralized CDD Coordinated Control Management components for use with those decision modules. After block 465, or if it is instead determined in block 460 to not perform such distributed management in an external manner (e.g., if only one decision module is executed, if multiple decision modules are executed but coordinate their operations in a distributed peer-to-peer manner, etc.), the routine continues to block 470 to optionally obtain and store information about the operations of the one or more decision modules and/or resulting activities that occur in the target system, such as for later analysis and/or reporting.
If it is instead determined in block 440 that the information or instructions received in block 410 are not to deploy one or more decision modules, the routine continues instead to block 485 to perform one or more other indicated operations if appropriate. For example, such other authorized operations may include obtaining results information about the operation of a target system in other manners (e.g., by monitoring outputs or other state information for the target system), analyzing results of operations of decision modules and/or activities of corresponding target systems, generating reports or otherwise providing information to users regarding such operations and/or activities, etc. In addition, in some embodiments the analysis of activities of a particular target system over time may allow patterns to be identified in operation of the target system, such as to allow a model of that target system to be modified accordingly (whether manually or in an automated learning manner) to reflect those patterns and to respond based on them. In addition, as discussed in greater detail elsewhere, distributed operation of multiple decision modules for an automated control system in a partially decoupled manner allows various changes to be made while the automated control system is in operation, such as to add one or more new decision modules, to remove one or more existing decision modules, to modify the operation of a particular decision module (e.g., by changing rules or other information describing the target system that is part of a model for the decision module), etc. 
In addition, the partially decoupled nature of multiple such decision modules in an automated control system allows one or more such decision modules to operate individually at times, such as if network communication issues or other problems prevent communication between multiple decision modules that would otherwise allow their individualized control actions to be coordinated. In such situations, some or all such decision modules may continue to operate in an individualized manner, such as to provide useful ongoing control operations for a target system even if optimal or near-optimal solutions cannot be identified from coordination and synchronization between a group of multiple decision modules that collectively provide the automated control system for the target system.
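The individualized fallback behavior described above can be sketched as follows, assuming for illustration that a decision module obtains peer proposals through a hypothetical `fetch_peer_actions` callable that raises `ConnectionError` when the peers cannot be reached, and that coordination is approximated by a plain average (a placeholder, not the synchronization technique of any particular embodiment).

```python
from typing import Callable, Dict


def choose_action(local_action: float,
                  fetch_peer_actions: Callable[[], Dict[str, float]]) -> float:
    """Use a coordinated value when peers respond, else act individually."""
    try:
        peers = fetch_peer_actions()
    except ConnectionError:
        # Communication problem: continue with the module's own action.
        return local_action
    if not peers:
        # No other decision modules to coordinate with.
        return local_action
    values = list(peers.values()) + [local_action]
    return sum(values) / len(values)  # placeholder coordination


def unreachable() -> Dict[str, float]:
    # Simulates a network failure between decision modules.
    raise ConnectionError("peers unavailable")


isolated = choose_action(4.0, unreachable)
coordinated = choose_action(4.0, lambda: {"b": 8.0})
```

The design choice illustrated is that a failure to coordinate degrades the quality of the chosen action rather than halting control of the target system.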
After blocks 470 or 485, the routine continues to block 495 to determine whether to continue, such as until an explicit indication to terminate is received. If it is determined to continue, the routine returns to block 410, and otherwise continues to block 499 and ends.
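For reference, the branching of routine 400 described above can be summarized by the following Python sketch, which returns the sequence of block numbers visited while handling one received request. The dictionary keys (`create_or_revise`, `deploy`, `external_coordination`) are hypothetical names chosen for illustration, and the sketch models only the control flow, not the work performed in each block.

```python
from typing import Dict, List


def blocks_visited(request: Dict[str, bool]) -> List[int]:
    """Trace the blocks of routine 400 visited for one received request."""
    path = [410, 420]                      # receive, then test for create/revise
    if request.get("create_or_revise", False):
        path += [425, 430]                 # construct and store decision modules
    path.append(440)                       # test for deployment
    if request.get("deploy", False):
        path += [450, 460]                 # start modules, test for external management
        if request.get("external_coordination", False):
            path.append(465)               # start centralized coordination components
        path.append(470)                   # optionally record operations
    else:
        path.append(485)                   # other indicated operations
    path.append(495)                       # decide whether to continue
    return path


deploy_only = blocks_visited({"deploy": True})
create_only = blocks_visited({"create_or_revise": True})
```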
FIGS. 5A-5B illustrate a flow diagram of an example embodiment of a CDD Decision Module Construction routine 500. The routine may, for example, be provided by execution of the component 342 of FIG. 3 and/or the component 142 of FIG. 1, such as to provide functionality to allow users to provide information describing a target system of interest, and to perform corresponding automated operations to construct one or more decision modules to use to control the target system in specified manners. While the illustrated embodiment of the routine interacts with users in particular manners, such as via a displayed GUI (graphical user interface), it will be appreciated that other embodiments of the routine may interact with users in other manners, such as via a defined API (application programming interface) that an executing program invokes on behalf of a user. In some embodiments, the routine may further be implemented as part of an integrated development environment or other software tool that is available for one or more users to use, such as by implementing an online interface that is available to a variety of remote users over a public network such as the Internet, while in other embodiments a copy of the CDD system and/or particular CDD components may be used to support a single organization or other group of one or more users, such as by being executed on computing systems under the control of the organization or group. In addition, the CDD Decision Module Construction component may in some embodiments and situations be separated into multiple sub-components, such as a rules editor component that users interact with to specify rules and other description information for a target system, and a rules compiler engine that processes the user-specified rules and other information to create one or more corresponding decision modules.
The illustrated embodiment of the routine 500 begins at block 510, where the routine provides or updates a displayed user interface to one or more users, such as via a request received at an online version of a component that is implementing the routine, or instead based on the routine being executed by one or more such users on computing systems that they control. While various operations are shown in the illustrated embodiment of the routine as occurring in a serial manner for the purpose of illustration, it will be appreciated that user interactions with such a user interface may occur in an iterative manner and/or over multiple periods of time and/or user sessions, including to update a user interface previously displayed to a user in various manners (e.g., to reflect a user action, to reflect user feedback generated by operation of the routine or from another component, etc.), as discussed further below.
After block 510, the routine continues to block 520 to receive information from one or more such users describing a target system to be controlled, including information about a plurality of elements of the target system that include one or more manipulatable control elements and optionally one or more outputs that the control elements affect, information about rules that specify restrictions involving the elements, information about state information that will be available during controlling of the system (e.g., values of particular elements or other state variables), and one or more goals to achieve during the controlling of the target system. It will be appreciated that such information may be obtained over a period of time from one or more users, including in some embodiments for a first group of one or more users to supply some information related to a target system and for one or more other second groups of users to independently provide other information about the target system, such as to reflect different areas of expertise of the different users and/or different parts of the target system.
After block 520, the routine continues to block 525 to identify any errors that have been received in the user input, and to prompt the user(s) to correct those errors, such as by updating the display in a corresponding manner as discussed with respect to block 510. While the identification of such errors is illustrated as occurring after the receiving of the information in block 520, it will be appreciated that some or all such errors may instead be identified as the users are inputting information into the user interface, such as to identify syntax errors in rules or other information that the users specify. After block 525, the illustrated embodiment of the routine continues to block 530 to optionally decompose the information about the target system into multiple subsets that each correspond to a portion of the target system, such as with each subset having one or more different control elements that are manipulatable by the automated control system being created by the routine, and optionally have overlapping or completely distinct goals and/or sets of rules and other information describing the respective portions of the target system. As discussed in greater detail elsewhere, such decomposition, if performed, may in some situations be performed manually by the users indicating different subgroups of information that they enter, and/or in an automated manner by the routine based on an analysis of the information that has been specified (e.g., based on the size of rules and other descriptive information supplied for a target system, based on inter-relationships between different rules or goals or other information, etc.). In other embodiments, no such decomposition may be performed.
After block 530, the routine continues to block 535 to, for each subset of target system description information (or for all the received information if no such subsets are identified), convert that subset (or all the information) into a set of constraints that encapsulate the restrictions, goals, and other specified information for that subset (or for all the information). In block 540, the routine then identifies any errors that occur from the converting process, and if any are identified, may prompt the user to correct those errors, such as in a manner similar to that described with respect to blocks 525 and 510. While not illustrated in this example, the routine may in some situations in blocks 525 and/or 540 return to block 510 when such errors are identified, to display corresponding feedback to the user(s) and to allow the user(s) to make corrections and re-perform following operations such as those of blocks 520-540. The errors identified in the converting process in block 540 may include, for example, errors related to inconsistent restrictions, such as if the restrictions as a group are impossible to satisfy.
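As one non-limiting illustration of the kind of inconsistency that block 540 may identify, the following minimal sketch assumes that each restriction is a simple lower and upper bound on a named element. The `Restriction` class, the `to_constraints` function, and the element names are hypothetical and are not part of the described embodiments, which may use much richer rule forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restriction:
    """Hypothetical rule form: the element must lie within [lower, upper]."""
    element: str
    lower: float
    upper: float


def to_constraints(restrictions):
    """Merge restrictions into one (lower, upper) bound per element.

    Returns (constraints, errors), where errors lists the elements whose
    combined restrictions are impossible to satisfy as a group.
    """
    constraints = {}
    for r in restrictions:
        lo, hi = constraints.get(r.element, (float("-inf"), float("inf")))
        # Intersect the new interval with the bounds accumulated so far.
        constraints[r.element] = (max(lo, r.lower), min(hi, r.upper))
    errors = [name for name, (lo, hi) in constraints.items() if lo > hi]
    return constraints, errors


rules = [
    Restriction("battery_current_amps", 0.0, 20.0),
    Restriction("battery_current_amps", 25.0, 40.0),  # conflicts with the rule above
    Restriction("battery_temp_c", -10.0, 45.0),
]
constraints, errors = to_constraints(rules)
```

In this sketch the two current restrictions have an empty intersection, so the element is reported as an error that a user could be prompted to correct.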
After block 540, the routine continues to block 545 to, for each set of constraints (or a single constraint set if no subsets were identified in block 530), apply one or more validation rules to the set of constraints to test overall effectiveness of the corresponding information that the constraints represent, and to prompt the one or more users to correct any errors that are identified in a manner similar to that with respect to blocks 525, 540 and 510. Such validation rules may test one or more of controllability, observability, stability, and goal completeness, as well as any user-added validation rules, as discussed in greater detail elsewhere. In block 550, the routine then converts each validated set of constraints to a set of coupled differential equations that model at least a portion of the target system to which the underlying information corresponds.
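The description above does not specify how a controllability validation rule is evaluated. As a hedged example only, the following sketch applies the standard Kalman rank condition, which holds for linear time-invariant models of the form dx/dt = Ax + Bu and is an assumption here rather than the method of the described embodiments. The helper names `mat_mul`, `rank`, and `is_controllable` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m, tol=1e-9):
    """Rank of a matrix via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(m[i][c]), default=None)
        if pivot is None or abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= factor * m[r][j]
        r += 1
        if r == rows:
            break
    return r


def is_controllable(a, b):
    """Kalman rank test: rank [B, AB, ..., A^(n-1)B] must equal n."""
    n = len(a)
    blocks = [b]
    for _ in range(n - 1):
        blocks.append(mat_mul(a, blocks[-1]))
    # Concatenate the blocks side by side, one output row per state.
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb) == n


# Double integrator: the input drives the second state, which drives the first.
A = [[0.0, 1.0], [0.0, 0.0]]
controllable = is_controllable(A, [[0.0], [1.0]])
# Here the input only reaches the first state, so the second cannot be steered.
uncontrollable = is_controllable(A, [[1.0], [0.0]])
```

An analogous rank test on the pair (A, C) is the usual check for observability of the same class of linear models.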
After block 550, the routine continues to block 553 to perform activities related to training a model for each set of coupled differential equations, including to determine one or more of a size of a training time window to use, size of multiple training time slices within the time window, and/or a type of training time slice within the time window. In some embodiments and situations, the determination of one or more such sizes or types of information is performed by using default or pre-specified information, while in other embodiments and situations the users may specify such information, or an automated determination of such information may be performed in one or more manners (e.g., by testing different sizes and evaluating results to find sizes with the best performance). Different types of time slices may include, for example, successions of time slices that overlap or do not overlap, such that the training for a second time slice may be dependent only on results of a first time slice (if they do not overlap) or instead may be based at least in part on updating information already determined for at least some of the first time slice (if they do overlap in part or in whole). After block 553, the routine continues to block 555 to, for each set of coupled differential equations representing a model, train the model for that set of coupled differential equations using partial initial state information for the target system, including to estimate values of variables that are not known and/or directly observable for the target system by simulating effects of performing control actions over the time window, such as for successive time slices throughout the time window, and to test the simulated performance of the trained model. Additional details related to training and testing are included elsewhere herein.
After block 555, the routine continues to block 560 to determine whether the training and testing were successful, and if not returns to block 510 to display corresponding feedback information to the users to allow them to correct errors that caused the lack of success. If it is instead determined in block 560 that the testing and training were successful, however, the routine continues instead to block 570 to generate an executable decision module for each trained and tested model that includes that model, as well as a local CDD Control Action Determination component that the decision module will use when executed to determine optimal or near-optimal control actions to perform for the target system based on the information included in the model, and in light of the one or more goals for that decision module. The generated executable decision module may in some embodiments and situations further include a local CDD Coordinated Control Management component to coordinate control actions of multiple decision modules that collectively will provide an automated control system for the target system, such as by synchronizing respective models of the various decision modules over time. After block 570, the routine continues to block 580 to provide the generated executable decision modules for use, including to optionally store them for later execution and/or deployment.
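The distinction between overlapping and non-overlapping training time slices may be illustrated with the following minimal sketch. The `time_slices` function and its parameters are hypothetical, and the units are arbitrary (e.g., seconds); the described embodiments may determine slice sizes and types in other manners.

```python
def time_slices(window_start, window_size, slice_size, step):
    """Return the (start, end) slices that fit within a training time window.

    step == slice_size gives non-overlapping slices; step < slice_size gives
    slices that overlap their predecessor in part.
    """
    if slice_size <= 0 or step <= 0 or slice_size > window_size:
        raise ValueError("slice_size and step must be positive and fit the window")
    slices = []
    start = window_start
    window_end = window_start + window_size
    while start + slice_size <= window_end:
        slices.append((start, start + slice_size))
        start += step
    return slices


non_overlapping = time_slices(0, 60, 20, 20)
overlapping = time_slices(0, 60, 20, 10)
```

With non-overlapping slices, training for each slice could start only from the results of the preceding slice, while with the overlapping variant each slice shares half of its span with the slice before it.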
After block 580, the routine continues to block 595 to determine whether to continue, such as until an explicit indication to terminate is received. If it is determined to continue, the routine returns to block 510, and otherwise continues to block 599 and ends.
FIGS. 6A-6B illustrate a flow diagram of an example embodiment of a routine 600 corresponding to a generic representation of a decision module that is being executed. The routine may, for example, be provided by execution of a decision module 329 or as part of an automated control system 325 of FIG. 3 and/or a decision module 124 or 128 of FIG. 1 or 2, such as to provide functionality for controlling at least a portion of a target system in a manner specific to information and a model encoded for the decision module, including to reflect one or more goals to be achieved by the decision module during its controlling activities. As discussed in greater detail elsewhere, in some embodiments and situations, multiple decision modules may collectively and cooperatively act to control a particular target system, such as with each decision module controlling one or more distinct control elements for the target system or otherwise representing or interacting with a portion of the target system, while in other embodiments and situations a single decision module may act alone to control a target system. The routine 600 further reflects actions performed by a particular example decision module when it is deployed in controlling a portion of a target system, although execution of at least portions of a decision module may occur at other times, such as initially to train a model for the decision module before the decision module is deployed, as discussed in greater detail with respect to the CDD Decision Module Construction routine 500 of FIGS. 5A-5B.
The illustrated embodiment of the routine 600 begins at block 610, where an initial model for the decision module is determined that describes at least a portion of a target system to be controlled, one or more goals for the decision module to attempt to achieve related to control of the target system, and optionally initial state information for the target system. The routine continues to block 615 to perform one or more actions to train the initial model if needed, as discussed in greater detail with respect to blocks 553 and 555 of FIGS. 5A-5B—in some embodiments and situations, such training for block 615 is performed only if initial training is not done by the routine 500 of FIGS. 5A-5B, while in other embodiments and situations the training of block 615 is performed to capture information about a current state of the target system at a time that the decision module begins to execute (e.g., if not immediately deployed after initial creation and training) and/or to re-train the model at times as discussed in greater detail with respect to routine 700 of FIGS. 7A-7B as initiated by block 630.
After block 615, the routine continues to block 617 to determine a time period to use for performing each control action decision for the decision module, such as to reflect a rate at which control element modifications in the target system are needed and/or to reflect a rate at which new incoming state information is received that may alter future manipulations of the control elements. The routine then continues to block 620 to start the next time period, beginning with a first time period moving forward from the startup of the execution of the decision module. Blocks 620-680 are then performed in a loop for each such time period going forward until execution of the decision module is suspended or terminated, although in other embodiments a particular decision module may execute for only a single time period each time that it is executed.
In block 625, the routine optionally obtains state information for the time period, such as current state information that has been received for the target system or one or more related external sources since the last time period began, and/or by actively retrieving current values of one or more elements of the target system or corresponding variables as needed. In block 630, the routine then initiates execution of a local CDD Control Action Determination component of the decision module, with one example of such a routine discussed in greater detail with respect to routine 700 of FIGS. 7A-7B. In block 635, the results of the execution of the component in block 630 are received, including to either obtain an updated model for the decision module with a local solution for the current time period and decision module that includes one or more proposed control action determinations that the decision module may perform for the current time period, or to receive an indication that no local solution was found for the decision module in the allowed time for the execution of the component in block 630. It is then determined in block 640 whether a solution was found, and if so the routine continues to block 642 to store the updated model for the decision module, and otherwise continues to block 643 to use the prior model for the decision module to determine one or more control action determinations that are proposed for the current time period based on a previous model (e.g., that does not reflect recent changes in state information and/or recent changes in activities of other decision modules, if any), as discussed in greater detail with respect to routine 700 of FIGS. 7A-7B.
After blocks 642 or 643, the routine continues to block 644 to determine if other decision modules are collectively controlling portions of the current target system, such as part of the same automated control system as the local decision module, and if so continues to block 645. Otherwise, the routine selects the local proposed control actions of the decision module as a final determined control action to perform, and continues to block 675 to implement those control actions for the current time period.
If there are other operating decision modules, the routine in block 645 determines if the local decision module includes a local copy of a CDD Coordinated Control Management (CCM) component for use in synchronizing the proposed control action determinations for the decision module's local solutions with activities of other decision modules that are collectively controlling the same target system. If so, the routine continues to block 647 to provide the one or more proposed control action determinations of the decision module and the corresponding current local model for the decision module to the local CDD CCM component, and otherwise continues to block 649 to provide the one or more proposed control action determinations for the decision module and the corresponding local model of the decision module to one or more centralized CDD CCM components.
After blocks 647 or 649, the routine continues to block 655 to obtain results of the actions of the CDD CCM component(s) in blocks 647 or 649, including to either obtain a further updated model resulting from synchronization of the local model for the current decision module with information from one or more other decision modules, such that the further updated model indicates one or more final control action determinations to perform for the time period for the current decision module, or an indication that no such synchronization was completed in the allowed time. The routine continues to block 660 to determine whether the synchronization was completed, and if so continues to block 665 to store the further updated model from the synchronization, and otherwise continues to block 670 to use the prior proposed control action determinations locally to the decision module as the final control action determinations for the time period.
After blocks 665 or 670, the routine continues to block 675 to implement the one or more final determined control actions for the decision module in the target system, such as by interacting with one or more effectuators in the target system that modify values or otherwise manipulate one or more control elements of the target system, or by otherwise providing input to the target system to cause such modifications or other manipulations to occur. In block 680, the routine optionally obtains information about the results in the target system of the control actions performed, and stores and/or provides information to the CDD system about such obtained results and/or about the activities of the decision module for the current time period.
After block 680, the routine continues to block 695 to determine whether to continue, such as until an indication to terminate or suspend is received (e.g., to reflect an end to current operation of the target system or an end of use of the decision module to control at least a portion of the target system). If it is determined to continue, the routine returns to block 620 to start the next time period, and otherwise continues to block 699 and ends.
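The per-time-period loop of blocks 620-680, including the two fallback paths (block 643 when no local solution is found, and block 670 when synchronization does not complete), may be sketched as follows. This is an illustrative simplification only: the function `run_decision_module`, the dictionary-based model with an `"actions"` entry, and the `dc_output_watts` name are all hypothetical, and implementing the actions in the target system (block 675) is represented merely by recording them.

```python
def run_decision_module(model, num_periods, solve_local, synchronize=None):
    """Run a hypothetical decision-module loop and return what was done.

    solve_local(model, period) returns an updated model, or None if no local
    solution was found in the allowed time. synchronize(model, period), if
    given, returns a further updated model, or None if synchronization did
    not complete in the allowed time.
    """
    history = []
    for period in range(num_periods):              # block 620
        updated = solve_local(model, period)       # blocks 630-635
        if updated is not None:                    # block 640
            model, source = updated, "local"       # block 642
        else:
            source = "prior"                       # block 643: reuse prior model
        final_actions = model["actions"]
        if synchronize is not None:                # blocks 644-655
            synced = synchronize(model, period)
            if synced is not None:                 # blocks 660-665
                model, source = synced, "synchronized"
                final_actions = model["actions"]
            # otherwise block 670: keep the locally proposed actions
        history.append((period, source, final_actions))  # stands in for block 675
    return history


def solve_local(model, period):
    """Toy solver that fails to find a solution in period 1."""
    if period == 1:
        return None
    return {"actions": {"dc_output_watts": 100 + 10 * period}}


initial_model = {"actions": {"dc_output_watts": 0}}
history = run_decision_module(initial_model, 3, solve_local)
```

In period 1 of this example the solver returns no solution, so the actions from the model stored in period 0 are reused for that period.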
FIGS. 7A-7B are a flow diagram of an example embodiment of a CDD Control Action Determination routine 700. The routine may, for example, be provided by execution of the component 344 of FIG. 3 and/or components 144 a-n or 244 of FIG. 2, such as to determine control actions for a decision module to propose and/or implement for a target system during a particular time period, including in some embodiments to perform an optimization to determine near-optimal actions (e.g., within a threshold of an optimal solution) to perform with respect to one or more goals if possible. While the illustrated embodiment of the routine is performed in a manner local to a particular decision module, such that some or all decision modules may each implement a local version of such a routine, in other embodiments the routine may be implemented in a centralized manner by one or more components with which one or more decision modules interact over one or more networks, such as with a particular decision module indicated to be used at a particular time rather than acting on behalf of the local decision module.
The illustrated embodiment of the routine 700 begins at block 703, where information or a request is received. The routine continues to block 705 to determine a type of the information or request, and to proceed accordingly. In particular, if a request is received in block 703 to attempt to determine a solution for a current time period given a current model of the local decision module, the routine continues to block 710 to begin to perform such activities, as discussed in greater detail with respect to blocks 710-790. If it is instead determined in block 705 that a request to relax one or more rules or other restrictions for the current model of the local decision module is received, such as discussed in greater detail with respect to blocks 760 and 765, the routine continues to block 765. If it is determined in block 705 that a request is received to repair one or more rules or other restrictions for the current model of the local decision module, such as discussed in greater detail with respect to blocks 775 and 780, the routine continues to block 780 to obtain user input to use during the rule repair process (e.g., to interact with a CDD Decision Module Construction component, or to instead interact with one or more users in another manner), such as to allow the current model for the local decision module to later be updated and replaced based on further resulting user actions, or if operation of the target system can be suspended, to optionally wait to further perform the routine 700 until such an updated model is received. If it is instead determined in block 705 that the information or request is of another type, the routine continues instead to block 708 to perform one or more other indicated operations as appropriate, and to then proceed to block 799.
Such other indicated operations may include, for example, receiving information about current models and/or control actions proposed or performed by one or more other decision modules that are collectively controlling a target system with the local decision module (such as for use in synchronizing the model of the local decision module with such other decision modules by generating a consensus or converged shared model, as discussed in greater detail with respect to routine 800 of FIGS. 8A-8B), to receive updates to a model or underlying information for the model for use in ongoing operation of the routine 700 (e.g., from a CDD Decision Module Construction component, such as results from interactions performed in block 780), to receive current state information for the target system, such as for use as discussed in routine 600 of FIGS. 6A-6B, etc.
If it is determined in block 705 that a request for a solution was received in block 703 for a current time period and based on a current model of the local decision module, the routine continues to block 710 to receive a current set of coupled differential equations that represent the current model for the local decision module of at least a portion of the target system, optionally along with additional state information for the target system for the current time. The routine then continues to block 715 to determine whether to train or re-train the model, such as if the routine is called for a first time upon initial execution of a corresponding decision module or if error measurements from ongoing operations indicate a need for re-training, as discussed in greater detail with respect to blocks 755, 770 and 730. If it is determined to train or re-train the model, the routine continues to block 720 to determine one or more of the size of a training time window, size of training time slices within the time window, and/or type of training time slices within the training time window, such as in a manner similar to that previously discussed with respect to block 553 of routine 500 of FIGS. 5A-5B. After block 720, the routine continues to block 725 to use partial initial state information for the target system to train the model, including to estimate values of state variables for the target system that are not known and/or directly observable, by simulating effects of performing control actions over the time window for each of the time slices, as discussed in greater detail with respect to block 555 of routine 500 of FIGS. 5A-5B.
After block 725, or if it is instead determined in block 715 not to train or re-train the model, the routine continues to block 730 to perform a piecewise linear analysis to attempt to determine a solution for the current model and any additional state information that was obtained in block 710, with the solution (if determined) including one or more proposed control action determinations for the local decision module to take for a current time period, as well as in some embodiments to use one or more model error gauges to make one or more error measurements with respect to the current model, as discussed in greater detail elsewhere. The routine then continues to block 735 to determine if the operations in block 730 determined a solution within an amount of time allowed for the operation of block 730 (e.g., a defined subset or fraction of the current time period), and if so continues to block 740 to update the current set of coupled differential equations and the resulting current model for the local decision module to reflect the solution, with the resulting updated information provided as an output of the routine 700.
If it is instead determined in block 735 that the operations in block 730 did not determine a solution, the routine continues to block 745 to determine if additional time is available within the current time period for further attempts to determine a solution, and if not continues to block 790 to provide output of the routine 700 indicating that no solution was determined for the current time period.
If additional time is available within the current time period, however, the routine continues to perform blocks 755-780 to perform one or more further attempts to identify the solution. It will be appreciated that one or more of the operations of blocks 755-780 may be repeatedly performed multiple times for a given time period if sufficient time is available to continue further solution determination attempts. In particular, the routine continues to block 755 if additional time is determined to be available in block 745, where it determines whether the measurements from one or more gauges indicate model error measurements that are over one or more thresholds indicating modifications to the model are needed, such as based on the model error measurements from the gauges discussed with respect to block 730. If not, the routine continues to block 760 to determine whether there are one or more rules or other restrictions in the current model that are available to be relaxed for the current time period (and for which relaxation has not previously been attempted during the time period, if this is not the first pass through this portion of the routine for the current time period), and if so continues to block 765 to relax one or more such rules or other restrictions and to return to block 730 to re-attempt the piecewise linear analysis with the revised model based on those relaxed rules or other restrictions.
If it is instead determined in block 755 that the model error measurements from one or more of the gauges are sufficient to satisfy one or more corresponding thresholds, the routine continues instead to block 770 to determine whether to re-train the model based on one or more of the gauges indicating sufficient errors to do so, such as based on accumulated errors over one or more time periods of updates to the model. If so, the routine returns to block 720 to perform such re-training in blocks 720 and 725, and then continues to block 730 to re-attempt the piecewise linear analysis with the resulting re-trained model.
If it is instead determined in block 770 not to re-train the model (or if the model was re-trained already for the current time period and the resulting re-attempt in block 730 again failed to find a solution), the routine continues to block 775 to determine whether the model error measurements from one or more of the gauges indicate a subset of one or more rules or other restrictions in the model that potentially have errors that need to be repaired. If so, the routine continues to block 780 to provide information to one or more users via the CDD Decision Module Construction component, to allow the users to revise the rules or other restrictions as appropriate, although in other embodiments some or all such rule repair activities may instead be attempted or performed in an automated manner. After block 780, or if it is instead determined in block 775 not to repair any rules, the routine continues to block 790 to provide an indication that no solution was determined for the current time period. After blocks 740, 708, or 790, the routine continues to block 799 and ends. It will be appreciated that if the routine 700 was instead implemented as a centralized routine that supports one or more decision modules remote from the executing component for the routine, the routine 700 may instead return to block 703 to await further information or requests.
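The relax-and-retry portion of the routine 700 (blocks 730, 735, 760, 765 and 790) may be sketched as follows. This sketch is a simplification under stated assumptions: a bounded attempt count stands in for the time allowed within the current time period, the model error gauges and re-training of blocks 755, 770 and 720-725 are omitted, and the names `determine_solution`, `solve`, and `max_current_amps` are hypothetical.

```python
def determine_solution(rules, solve, relaxable, max_attempts):
    """Try to solve, relaxing one rule per failed attempt while budget remains.

    rules:        dict of rule name -> limit currently in force
    solve:        callable taking the active rules, returning a solution or None
    relaxable:    ordered list of (rule name, relaxed limit) not yet tried
    max_attempts: stand-in for the time allowed within the time period
    """
    active = dict(rules)
    pending = list(relaxable)
    for _ in range(max_attempts):
        solution = solve(active)            # block 730
        if solution is not None:            # block 735
            return solution, active         # block 740
        if not pending:                     # block 760: nothing left to relax
            break
        name, relaxed_limit = pending.pop(0)
        active[name] = relaxed_limit        # block 765
    return None, active                     # block 790: no solution this period


def solve(active):
    """Toy solver: a 30 A demand is feasible only under a high enough limit."""
    demand_amps = 30.0
    if demand_amps <= active["max_current_amps"]:
        return {"current_amps": demand_amps}
    return None


solution, final_rules = determine_solution(
    {"max_current_amps": 20.0}, solve, [("max_current_amps", 35.0)], max_attempts=3)
no_time, _ = determine_solution(
    {"max_current_amps": 20.0}, solve, [("max_current_amps", 35.0)], max_attempts=1)
```

In the first call the initial attempt fails, the current limit is relaxed, and the second attempt succeeds; in the second call the budget is exhausted before the relaxed model can be re-tried, so no solution is reported.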
FIGS. 8A-8B are a flow diagram of an example embodiment of a CDD Coordinated Control Management routine 800. The routine may, for example, be provided by execution of the component 346 of FIG. 3 and/or the components 146 a-n of FIG. 2, such as to attempt to synchronize current models and their proposed control actions between multiple decision modules that are collectively controlling a target system. In the illustrated embodiment of the routine, the synchronization is performed in a pairwise manner between a particular local decision module's local current model and an intermediate shared model for that decision module that is based on information about the current state of one or more other decision modules, by using a Pareto game technique to determine a Pareto equilibrium if possible that is represented in a consensus shared model, although in other embodiments other types of synchronization methods may be used. In addition, in the illustrated embodiment, the routine 800 is performed in a local manner for a particular local decision module, such as by being included within that local decision module, although in other embodiments the routine 800 may be implemented in a centralized manner to support one or more decision modules that are remote from a computing system implementing the component for the routine and that communicate with those decision modules over one or more computer networks, such as with a particular decision module indicated to be used at a particular time rather than acting on behalf of the local decision module.
The illustrated embodiment of the routine 800 begins at block 805, where it waits to receive information or another indication. The routine continues to block 810 to determine if a consensus model or other updated information for another decision module has been received, such as from a copy of the routine 800 executing for that other decision module, and if so continues to block 815 to use the received information to update local intermediate shared model information for use with the local decision module on whose behalf the current copy of the routine 800 is executing, as discussed in greater detail with respect to block 830. If it is instead determined in block 810 that the information or request received in block 805 is not information related to one or more other decision modules, or after block 815, the routine continues to block 820 to determine whether to currently perform a synchronization for the current local model of the local decision module by using information about an intermediate shared model of the local decision module that includes information for one or more other decision modules, such as to do such synchronization each time that an update to the local decision module's model is received (e.g., based on operation of the routine 700 for a copy of the CDD Control Action Determination component local to that decision module) in block 805 and/or each time that information to update the local decision module's intermediate shared model is received in block 805 and used in block 815, or instead as explicitly indicated in block 805—if the synchronization is to currently be performed, the routine continues to block 825 and begins to perform blocks 820-880 related to such synchronization activities. 
Otherwise, the routine continues to block 885 to perform one or more other indicated operations as appropriate, such as to receive requests from the CDD system or other requestor for current information about operation of the routine 800 and/or to provide corresponding information to one or more entities (e.g., to reflect prior requests), etc.
If it is determined in block 820 that synchronization is to be currently performed, such as based on updated model-related information that is received in block 805, the routine continues to block 825 to obtain a current local model for the local decision module to use in the synchronizing, with the model including one or more proposed control actions to perform for a current time period based on a local solution for the local decision module. The routine then continues to block 830 to retrieve information for an intermediate shared model of the local decision module that represents information for one or more other decision modules (e.g., all other decision modules) that are collectively participating in controlling the target system, with that intermediate shared model similarly representing one or more other proposed control actions resulting from local solutions of those one or more other decision modules, optionally after partial or complete synchronization has been performed for those one or more other decision modules between themselves.
The routine then continues to block 835 to attempt to determine a consensus shared model that synchronizes the current model of the local decision module and the intermediate shared model by simultaneously providing solutions to both the local decision module's current model and the intermediate shared model. In some embodiments, the operations of block 835 are performed in a manner similar to that discussed with respect to blocks 710-730 of routine 700 of FIGS. 7A-7B, such as if the local model and the intermediate shared model are combined to create a combination model for which one or more solutions are to be identified. As discussed in greater detail elsewhere, in some embodiments, the local current model and intermediate shared model may each be represented by a Hamiltonian function to enable a straightforward creation of such a combined model in an additive manner for the respective Hamiltonian functions, with the operations of routines 600 and/or 700 of FIGS. 6A-6B and 7A-7B, respectively, similarly representing the models that they update and otherwise manipulate using such Hamiltonian functions.
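The additive combination described for block 835 can be illustrated with a deliberately simplified sketch. It assumes each model is reduced to a scalar cost over a single proposed control value, and it uses a grid search in place of whatever solver an embodiment would use; the names `combine` and `best_control` and both quadratic costs are hypothetical stand-ins, not part of the described system.

```python
from typing import Callable, Sequence

# A "model" here is a hypothetical stand-in for a Hamiltonian function:
# it maps a proposed control value to a scalar cost.
Model = Callable[[float], float]

def combine(models: Sequence[Model]) -> Model:
    """Additive combination: the combined cost is the sum of the member costs."""
    return lambda u: sum(m(u) for m in models)

def best_control(model: Model, candidates: Sequence[float]) -> float:
    """Return the candidate control with the lowest cost (grid search, an assumption)."""
    return min(candidates, key=model)

# Made-up costs: the local module prefers u = 2, the shared model prefers u = 6.
local_model: Model = lambda u: (u - 2.0) ** 2
shared_model: Model = lambda u: (u - 6.0) ** 2

candidates = [i * 0.5 for i in range(17)]  # 0.0, 0.5, ..., 8.0
consensus_u = best_control(combine([local_model, shared_model]), candidates)
print(consensus_u)
```

In this toy case the combined cost is minimized between the two individual preferences, which is the sense in which a single solution serves both models at once.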
After block 835, the routine continues to block 840 to determine whether the operations of block 835 succeeded in an allowed amount of time, such as a fraction or other portion of the current time period for which the synchronization is attempted to be performed, and if so the routine continues to block 845 to update both the local model and the intermediate shared model of the local decision module to reflect the consensus shared model. As earlier noted, if sufficient time is allowed for each decision module to repeatedly determine a consensus shared model with changing intermediate shared models representing one or more other decision modules of a collective group, the decision modules of the collective group may eventually converge on a single converged shared model, although in other embodiments and situations there may not be sufficient time for such convergence to occur, or other issues may prevent such convergence. After block 845, the routine continues to block 850 to optionally notify other decision modules of the consensus shared model determined for the local decision module (and/or of a converged shared model, if the operations of 835 were a last step in creating such a converged shared model), such as if each of the notified decision modules is implementing its own local version of the routine 800 and the provided information will be used as part of an intermediate shared model of those other decision modules that includes information from the current local decision module's newly constructed consensus shared model.
If it is instead determined in block 840 that a synchronization did not occur in the allowed time, the routine continues to perform blocks 860-875 to re-attempt the synchronization with one or more modifications, sometimes repeatedly if sufficient time is available, and in a manner similar to that discussed with respect to blocks 745-780 of routine 700 of FIGS. 7A-7B. In the illustrated example, the routine determines in block 860 if additional time is available for one or more such re-attempts at synchronization, and if not the routine continues to block 880 to provide an indication that no synchronization was performed within the allowed time. Otherwise, the routine continues to block 870 to take one or more actions to perform one or more of relaxing rules or other restrictions, repairing rules or other restrictions, and/or re-training a model, with respect to one or both of the current model of the local decision module and/or one or more other decision modules whose information is represented in the intermediate shared model of the local decision module. If it is determined in block 870 to proceed in this manner, the routine continues to block 875 to perform corresponding actions, sometimes one at a time in a manner similar to that discussed with respect to routine 700, including to cause resulting updates to the current model of the local decision module and/or to the local intermediate shared model of the local decision module, after which the routine returns to block 835 to re-attempt to synchronize the local model and the intermediate shared model of the local decision module.
If it is instead determined in block 870 that no further actions are to be performed with respect to relaxation, repair and/or re-training, the routine continues instead to block 880. After blocks 850, 880 or 885, the routine continues to block 895 to determine whether to continue, such as until an explicit indication to terminate or suspend operation of the routine 800 is received, such as to reflect an end to operation of the target system and/or an end to use of the local decision module and/or a collective group of multiple decision modules to control the target system. If it is determined to continue, the routine returns to block 805, and otherwise continues to block 899 and ends.
FIG. 9 illustrates a flow diagram of an example embodiment of a routine 900 performed for a representative generic target system, with respect to interactions between the target system and one or more decision modules that are controlling at least a portion of the target system. The routine may, for example, be provided by execution of a target system 360 and/or 370 of FIG. 3, and/or a target system 160 and/or 170 of FIGS. 1 and 2, such as to implement operations specific to the target system. It will be appreciated that the illustrated embodiment of the routine focuses on interactions of the target system with the one or more decision modules, and that many or all such target systems will perform many other operations in a manner specific to those target systems that are not illustrated here for the purpose of brevity.
The routine begins at block 910, where it optionally provides initial state information for the target system to a CDD system for use in an automated control system of the CDD system for the target system, such as in response to a request from the CDD system or its automated control system for the target system, or instead based on configuration specific to the target system (e.g., to be performed upon startup of the target system). After block 910, the routine continues to block 920 to receive one or more inputs from a collective group of one or more decision modules that implement the automated control system for the target system, including one or more modified values for or other manipulations of one or more control elements of a plurality of elements of the target system that are performed by one or more such decision modules of the automated control system. As discussed in greater detail elsewhere, the blocks 920, 930, 940 may be repeatedly performed for each of multiple time periods, which may vary greatly in time depending on the target system (e.g., a microsecond, a millisecond, a hundredth of a second, a tenth of a second, a second, 2 seconds, 5 seconds, 10 seconds, 15 seconds, 30 seconds, a minute, 5 minutes, 10 minutes, 15 minutes, 30 minutes, an hour, etc.).
After block 920, the routine continues to block 930 to perform one or more actions in the target system based on the inputs received, including to optionally produce one or more resulting outputs or other results within the target system based on the manipulations of the control elements. In block 940, the routine then optionally provides information about the outputs or other results within the target system and/or provides other current state information for the target system to the automated control system of the CDD system and/or to particular decision modules of the automated control system. The routine then continues to block 995 to determine whether to continue, such as until an explicit indication to terminate or suspend operation of the target system is received. If it is determined to continue, the routine returns to block 920 to begin a next set of control actions for a next time period, and otherwise continues to block 999 and ends. As discussed in greater detail elsewhere, state information that is provided to a particular decision module may include requests from external systems to the target system, which the automated control system and its decision modules may determine how to respond to in one or more manners.
The following sections describe a variety of specific, non-exclusive embodiments in which some or all of the techniques may be implemented. It will be appreciated that particular details of particular embodiments may not be included in or true for all embodiments, and that the described embodiments may be implemented individually or in combination with any and all other combinations of other embodiments.
The following discusses several non-exclusive example embodiments, in which one or more specified embodiments of the Collaborative Distributed Decision system are each referred to as a Collaborative Distributed Inferencing (CDI) system or a Cooperative Distributed Inferencing Platform (CDIP), in which one or more specified embodiments of the Decision Module Construction component are each referred to as the “Rules Builder” or including or being performed by a “Rule Conversion Engine” (RCE) or a “CDL Compiler” or a “Rule(s) Entry Interface” (REI) or a “Rules and Goals UI”, in which one or more specified embodiments of the Control Action Determination component are each referred to as having “chattering” subcomponents or performing “chattering” functionality or including or being performed by a “Query Response Engine” (QRE) or an “Optimization Element” or via “Hamiltonian-Chattering Control”, in which one or more specified embodiments of the Coordinated Control Management component are each referred to as having or performing “mean field” information or functionality or including or being performed by a “Tellegen” agent or a “Pareto Multi-Criteria Optimization Engine” (PMOE) or a “Pareto element”, in which decision modules are referred to as “agents” or “peer agents” or “agent nodes” or “decision elements”, in which an automated control system may include a “cluster” of agents and in some cases is referred to as a “distributed architecture”, in which a target system is referred to as an “application domain”, in which a decision module's stored information about a target system includes an “Internal Heterogeneous Database” (IHDB) and/or a “Domain Rules Knowledge Base”, in which a decision module's consensus shared model is referred to as an “internal optimum” generated using mean field information, in which changes propagated from one decision module to others are referred to as a “delta”, etc.
CDI is built using a Peer-to-Peer (P2P) based distributed architecture that allows for partitioning the overall optimization problem into smaller sub-tasks or work-loads between peer agents. CDI Peer Agents are equally privileged participants in the application domain. They are configured to form a peer-to-peer network of nodes. This allows each agent in the network to solve the problem independently, using its own internal knowledge base. Each agent also internally engages in Pareto game playing to reach internal optimum before sharing changes with other agents. The agents then communicate the mean field changes with other agents in the network using a gossip protocol to synchronize their internal mean field approximation in an eventually consistent fashion. FIG. 10 illustrates a network diagram 1000 of a portion of a distributed architecture of an example CDI system.

CDI Cluster Setup

When the system is initially configured, certain agents are tagged as seed nodes in the network. The seed agent nodes can be started in any order, and it is not necessary to have all the seed agent nodes running. When initializing the CDI cluster, at least one seed agent node is preferably running; otherwise the other CDI seed nodes may not become initialized and other CDI agent nodes might not join the cluster.

CDI Agent Cluster Registry

Each agent has a built-in registry of active CDI agent nodes in the cluster. This agent registry is started on all CDI agent nodes and is updated with every change in cluster membership, including new agents joining the cluster, agents leaving the cluster, agent timeouts, etc. Agents inform each other of active membership through a heartbeat mechanism. The registry is eventually consistent, i.e., changes may not immediately be visible at other nodes, but typically they will be fully replicated to all other nodes after a few seconds. The changes are propagated using the change deltas and are disseminated in a scalable way to other nodes with a gossip protocol.

CDI Cluster Agent Failure Detection

The CDI agent nodes in the cluster monitor each other by sending heartbeats to detect if a CDI agent node is unreachable from the rest of the CDI cluster. The heartbeat arrival times are interpreted by an implementation of the Phi Accrual Failure Detector (Hayashibara, Défago, Yared, & Katayama, 2004).
The suspicion level of failure is given by a value called phi. The basic idea of the phi failure detector is to express the value of phi on a scale that is dynamically adjusted to reflect current network conditions.
The value of phi is calculated as:
phi=−log₁₀(1−F(timeSinceLastHeartbeat))
F is the cumulative distribution function of a normal distribution with mean and standard deviation estimated from historical heartbeat inter-arrival times.
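A minimal sketch of this calculation follows, using only the Python standard library. The function name `phi`, the floor on the standard deviation, the clamp before the logarithm, and the heartbeat history are illustrative assumptions rather than details of the described implementation.

```python
import math
from statistics import NormalDist, mean, stdev

def phi(time_since_last: float, intervals: list[float]) -> float:
    """Suspicion level phi = -log10(1 - F(t)), where F is a normal CDF
    fitted to historical heartbeat inter-arrival times."""
    mu = mean(intervals)
    sigma = max(stdev(intervals), 1e-3)  # floor avoids a zero-width distribution (assumption)
    p_later = 1.0 - NormalDist(mu, sigma).cdf(time_since_last)
    p_later = max(p_later, 1e-300)       # avoid log10(0) for very late heartbeats (assumption)
    return -math.log10(p_later)

# Made-up history: seconds between past heartbeats.
history = [1.0, 1.1, 0.9, 1.05, 0.95]
print(phi(1.0, history))  # heartbeat is about on time: low suspicion
print(phi(2.0, history))  # heartbeat is far overdue: high suspicion
```

Because the mean and standard deviation come from observed inter-arrival times, the same silence produces a lower phi on a jittery network than on a steady one, which is the dynamic scaling described above.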

CDI Cluster Agent Network Partition

Once a CDI agent node becomes network partitioned, it will stop receiving mean field updates from the other agents in the cluster. This, however, should not prevent it from continuing to perform local optimizations based on its internal knowledge base, local mean field and the sensor inputs that it will continue to receive.
The Rules Builder contains the following components:

- Rules Entry Interface
- Validator
- CDL Compiler
- Step Processor
- Workflow Remote Control
- Compiled Rules Engine Artifact

The Rules Builder also interacts directly with the following component:

- Master Agent

The Master Agent is responsible for interaction with all Chattering components, including the following:

- System Trainer & Bootstrapper
- Runner

The system also utilizes a Persistent Store to access state and necessary models/data. FIG. 11 illustrates a network diagram 1100 of a portion of an example Rules Builder component.

Rules Entry Interface:

The Rules Entry Interface is responsible for receiving rules in a syntax familiar to a domain expert. The entry interface is a text entry tool where a domain expert would insert the rules, goal and system information (rules script(s)) to provide continuous control suggestions for a given problem. The entered rules script(s) would be composed in the CDI Domain-Specific Language (DSL). The CDI DSL facilitates functional translation of the entered rules script(s) into a problem definition which houses the control definitions (measurable variable with ranges allowed and any associated constraints on the control variable).
For example:
In the entry interface, using the CDI DSL, one would specify an upper and lower bound with the following syntax, first defining each rule with its parameters and the rule logic that evaluates against the system state, and then referencing those rules from the control variable.
rule id: “max production”, params: [“production”], {it<=15.0}
rule id: “min production”, params: [“production”], {it>=−50.0}
control var: “production”, min: “min production”, max: “max production”
In the entry interface using the CDI DSL one would specify the dynamics (variables that change with relation to other measurable variables and the definition of this change)
delta var: “inventory”, params: [“production”, “demand”], {p, d−>p−d}
In the entry interface using the CDI DSL one would specify the goal (an objective i.e., to minimize or maximize a relationship between the control and controllable dynamics).
goal objective: “minimize”, params: [“inventory”, “production”], {inventory, production−>((inventory−10)*(inventory−10))+(production*production)}
Finally, the entry interface is also responsible for enabling a domain expert to provide certain settings that are used by the system, as well as user workflow steps. The settings provided represent key: value terms used by the system and are available throughout via the Settings Provider encapsulated within the Rules Engine. A domain expert might write initial values of the following form:
settings [initialState: “20”, initialCoState: “0”, terminalCoState: “0”, numChatteringLevels: “9”, horizon: “3”, delta: “0.1”, iterations: “15”]
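To make the semantics of the example script concrete, the following sketch renders the same rules, dynamics and goal as plain Python functions and applies one update. It is not the CDI DSL or its compiler output; the forward-Euler update, the reading of the `delta` setting as a step size, the use of `initialState` as the starting inventory, and the demand value are all assumptions made for illustration.

```python
from typing import Callable, Dict

StateMap = Dict[str, float]

# Rules: name -> predicate on the named parameter ("it" in the DSL example).
rules: Dict[str, Callable[[float], bool]] = {
    "max production": lambda it: it <= 15.0,
    "min production": lambda it: it >= -50.0,
}

def inventory_delta(p: float, d: float) -> float:
    """delta var "inventory": rate of change of inventory, p - d."""
    return p - d

def goal(inventory: float, production: float) -> float:
    """Objective to minimize, copied from the goal in the example script."""
    return (inventory - 10) * (inventory - 10) + production * production

def step(state: StateMap, production: float, demand: float, delta: float) -> StateMap:
    """Advance inventory by one step of length delta (forward Euler, an assumption)."""
    if not all(rule(production) for rule in rules.values()):
        raise ValueError("production violates a control bound")
    inventory = state["inventory"] + delta * inventory_delta(production, demand)
    return {"inventory": inventory, "production": production}

state = {"inventory": 20.0, "production": 0.0}  # initialState: "20" (assumed to be inventory)
state = step(state, production=5.0, demand=8.0, delta=0.1)
print(state["inventory"], goal(state["inventory"], state["production"]))
```

Separating the bounds, the dynamics and the goal in this way mirrors the three kinds of statements in the script, so each can be evaluated independently against a state map.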

CDL Compiler:

The CDL Compiler is responsible for the conversion of said rules into evaluatable constraints, system information and an optimization goal. The CDL Compiler translates the entered script(s) into a problem definition. A problem definition is composed of labeled rules. A rule is a tuple comprising a unique name and a Term. A Term is an evaluatable function whose logic was authored as described above, and whose input contains a representation of the system state (StateMap).

Validator:

The Validator is responsible for the validation of converted constraints with regards to restrictions the optimization problem needs to satisfy as a prerequisite to its solution and/or reaching a solution quality threshold.
The converted constraints defined in the problem definition, along with the settings provider, are validated for controllability, observability, stability and goal completeness.
controllability—this check ensures that every control variable has a defined rule relating it to one or more dynamic state variables. The above example satisfies this check since our single control variable ‘production’ is used in the dynamic definition of ‘inventory’;
observability—checks the statemap against the defined rules in the script(s). An observability check does not impede the solution of a problem, however if failing, represents that our problem may not be well defined;
stability—this check ensures that our system will converge over time on a solution, which can be tested in the testing stage; and
goal completeness—ensures that every control variable appears in the goal.
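Two of these checks reduce to set comparisons over the problem definition, as the following sketch suggests. The dictionary-based representation of the problem definition and the function names are hypothetical; the observability and stability checks are omitted because the text does not specify them in enough detail to sketch.

```python
from typing import Dict, List, Set

def check_controllability(controls: Set[str], dynamics: Dict[str, List[str]]) -> Set[str]:
    """Return the control variables that no dynamics definition refers to."""
    used = {p for params in dynamics.values() for p in params}
    return controls - used

def check_goal_completeness(controls: Set[str], goal_params: List[str]) -> Set[str]:
    """Return the control variables that do not appear in the goal."""
    return controls - set(goal_params)

# Problem definition taken from the inventory example above.
controls = {"production"}
dynamics = {"inventory": ["production", "demand"]}
goal_params = ["inventory", "production"]

print(check_controllability(controls, dynamics))      # empty set: check passes
print(check_goal_completeness(controls, goal_params))  # empty set: check passes
```

Returning the offending variables rather than a boolean makes it straightforward to report errors or warnings back to the domain expert.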

Compiled Rules Engine:

The conversion of rule scripts into mathematical expressions, which can be yielded through an explicit contract to the chattering agent for use in solving the optimization goal, manifests in the compiled artifact that we label the Compiled Rules Engine.
The Rules Engine defines and implements several interfaces for acquiring the values of these mathematical expressions, as later described with respect to “chattering” components. The Rules Engine is compiled as an artifact available to the Chattering component by the Compiler and made available for use in continuous processing. FIG. 12 illustrates a network diagram 1200 of example sub-components of an example Rules Builder component, in which compilation is dependent upon successful Validation of the observability, completeness, controllability, and workflow step validation. Any errors/warnings are returned to the domain expert for correction. Once corrections/improvements are complete, a new Compiled Rules Engine artifact is produced along with new commands delivered to the Step Processor.

Steps Processor and Remote Control:

Furthermore, the rules builder is also responsible for facilitating the workflow steps an individual domain expert would need to undertake before deploying such a system. These steps include:

- training an optimization system using the converted rules, with bootstrapped sensor data;
- testing the resulting control actuator inputs (the end results of the chattering optimization); and
- persisting the trained model for use in the running of the overall system.

The workflow component allows for training and testing a system. This is integrated into the DSL in the form of a workflow step builder.
For example,


workflow {

	train.onComplete {
		persist: model, ‘/tmp/model’
	}

	run.withSensorData(‘/tmp/simulatedData’).after(10 minutes) {
		persist: results, ‘/tmp/results’
		shutdown
	}

}

Here a domain expert instructs the default bootstrapper to be used in loading predefined sensor data, train the system on this data, persist the trained model output to the file ‘/tmp/model’, then once trained to begin running the system (as a single agent). The inputs are tested by analyzing the results from the run. The artifacts and model information are persisted to a persistent store. The Steps Processor facilitates the dispatch of interpreted workflow steps to the Remote Control for delegation to the Master Agent of a running system for processing.
FIG. 13 illustrates a network diagram 1300 of interactions of portions of a Rules Builder component with an executing agent, including the Steps processor, the Remote Control and interactions with a running agent (the Step Processor handles interpretation of workflow steps, and the steps are delivered for dispatch by the Remote Control, which passes commands to the Master Agent; the Master Agent handles communication between a running system's controlled elements and the Remote Control, and examples of the Controlled Elements are the Trainer and Runner).
FIG. 14 illustrates a network diagram 1400 of interactions of portions of a Rules Builder component with chattering components, including a Running Agent that delivers messages and interacts with the chattering components to facilitate training and running the system (the Trainer persists its model once training has completed; the Trainer invokes a Bootstrapper for acquisition of training data; the Trainer utilizes the Chattering Library; Chattering utilizes the compiled rules; Chattering persists its control suggestions; the Bootstrapper persists training data).
With respect to the discussion below, FIG. 15 illustrates a diagram 1500 of a sliding window for use with chattering, and is referred to in the text for the example embodiment as “FIG. 1”, while FIG. 16 illustrates a diagram 1600 of using Lebesgue integration to approximate a control trajectory for chattering, and is referred to in the text for the example embodiment as “FIG. 2”.

1. Parametric Chattering Control Over a Rolling Horizon

The problem we would like to solve in real time is posed over an infinite time horizon, but our approach is to consider a window of fixed length, and then apply invariant imbedding as we increase the window length to achieve an optimal relaxed control. The chattering control is applied to the window of fixed length, and results in a relatively easily solved problem. The window then slides to a point where, perhaps, a new measurement is available or a new control action is applicable, and the algorithm iterates in this manner.
To develop our chattering control method, we first start with a classical, deterministic control problem of the form,
$\begin{matrix} \min_{u (t) \in U} \int_{t_{0}}^{T} ℒ (x (t), u (t)) dt + ψ (x (T)) \\ \text{s.t. } x (t) = x (t_{0}) + \int_{t_{0}}^{t} f (x (τ), u (τ)) d τ \\ \text{or alternatively} \\ \text{s.t. } \dot{x} (t) = f (x (t), u (t)) \text{ for } t_{0} \leq t \leq T & (1) \end{matrix}$
where x(t)∈ℝⁿ, u(t)∈U⊂ℝⁿ, ℒ is twice continuously differentiable with respect to x and continuous with respect to u, and ψ(x(T)): ℝⁿ→ℝ is continuous and twice continuously differentiable with respect to x. Also, f(x(t), u(t)) is twice continuously differentiable with respect to x and u. The time horizon T is assumed to be finite and known. We also assume initial conditions x(t₀) are known.
The objective is to get the behavior at T based on information from the recent past. We develop discrete time windows with dynamics of recent past, update the state with sensory information, and construct an open loop feedback strategy.
We reformulate Problem (1) to a standard form. First we introduce a new variable x_n+1(t), with x_n+1(t₀)=0, and defined by
$\begin{matrix} \dot{x}_{n + 1} (t) = ℒ (x (t), u (t)) . & (2) \end{matrix}$
Note that, for t₀≦t≦T,
$x_{n + 1} (t) = \int_{t_{0}}^{t} ℒ (x (τ), u (τ)) d τ .$
Further, convert the terminal term in the criterion by defining a state variable y(t)∈ℝ¹, with y(t₀)=0, and
$\begin{matrix} \dot{y} (t) = \frac{\partial ψ (x (t))}{\partial x (t)} f (x (t), u (t)) & (3) \end{matrix}$
for t₀≦t≦T. Note that y(T)=ψ(x(T)) for any trajectory since
$\dot{y} (t) = \frac{\partial ψ (x (t))}{\partial x (t)} \dot{x} (t) .$
The original control problem (1) is now of the form
$\min_{u (t) \in U} x_{n + 1} (T) + y (T)$ $s . t . x_{n + 1} (t) = \int_{t_{0}}^{t} ℒ (x (τ), u (τ)) d τ$ $y (t) = \int_{t_{0}}^{t} \frac{\partial ψ}{\partial x} f (x (τ), u (τ)) d τ$ $x (t) = x (t_{0}) + \int_{t_{0}}^{t} f (x (τ), u (τ)) d τ$
For ease of notation, we create an n+2 dimension state vector,
$z (t) = [\begin{matrix} x_{n + 1} (t) \\ x (t) \\ y (t) \end{matrix}]$
and denote the dynamics in integral form as
$z (t) = z (t_{0}) + \int_{t_{0}}^{t} F (z (τ), u (τ)) d τ$
or, if the dynamics are differentiable, as
$\dot{z} (t) = F (z (t), u (t)) = [\begin{matrix} {\dot{x}}_{n + 1} (t) \\ \dot{x} (t) \\ \dot{y} (t) \end{matrix}] = [\begin{matrix} ℒ (x (t), u (t)) \\ f (x (t), u (t)) \\ \frac{\partial ψ}{\partial x} f (x (t), u (t)) \end{matrix}] .$
Note that the state vector z(t) captures the original dynamics, criterion and terminal cost.
Without loss of generality, we now consider the following optimal control problem
$\begin{matrix} \min_{u (t) \in U} Qz (T) s . t . z (t) = z (t_{0}) + \int_{t_{0}}^{t} F (z (τ), u (τ)) d τ & (4) \end{matrix}$
where Q is a row vector of dimension n+2 with Q=[1 0_{1×n} 1], so that Qz(T)=x_{n+1}(T)+y(T). We assume z(t₀) is known (since x(t₀) is assumed known, x_{n+1}(t₀)=0, and y(t₀)=0).
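The reformulation can be checked numerically on a small case. The sketch below assumes a scalar state (n=1), a constant control, forward-Euler integration, and made-up choices of ℒ, f and ψ based on the inventory example; none of these choices come from the described implementation. Note that because y(t₀)=0, the integrated y(T) equals ψ(x(T))−ψ(x(t₀)), a constant offset from ψ(x(T)) that does not change the minimizing control.

```python
from typing import List

D = 8.0  # constant demand (hypothetical)

def running_cost(x: float, u: float) -> float:
    """Running cost L(x, u) from the inventory example."""
    return (x - 10.0) ** 2 + u ** 2

def f(x: float, u: float) -> float:
    """Dynamics of the original state x."""
    return u - D

def psi(x: float) -> float:
    """Terminal cost psi(x) (made-up choice)."""
    return (x - 10.0) ** 2

def dpsi_dx(x: float) -> float:
    return 2.0 * (x - 10.0)

def F(z: List[float], u: float) -> List[float]:
    """Augmented dynamics for z = [x_{n+1}, x, y] with n = 1."""
    x = z[1]
    return [running_cost(x, u), f(x, u), dpsi_dx(x) * f(x, u)]

def integrate(x0: float, u: float, T: float, steps: int) -> List[float]:
    """Forward-Euler integration of z' = F(z, u) with a constant control."""
    dt = T / steps
    z = [0.0, x0, 0.0]
    for _ in range(steps):
        dz = F(z, u)
        z = [zi + dt * dzi for zi, dzi in zip(z, dz)]
    return z

Q = [1.0, 0.0, 1.0]
x0, u, T = 20.0, 5.0, 1.0
zT = integrate(x0, u, T, steps=1000)
mayer_cost = sum(q * zi for q, zi in zip(Q, zT))  # Q z(T) = x_{n+1}(T) + y(T)
print(mayer_cost, zT[2], psi(zT[1]) - psi(x0))
```

The last two printed values agree up to the integration error, which shows that the terminal cost has been absorbed into the state vector.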
For the rolling horizon approach, we consider the initial problem on a window [t₀,γ], partitioned into N sufficiently small intervals of length Δ. We let Δ be the same for all intervals for ease of notation. Note that t₀+NΔ=γ. We get a sequence of windows as the window moves, and the kth window is [t₀+kΔ,γ+kΔ]. See FIG. 1 for an illustration of the sliding window.
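The window bookkeeping is simple enough to state directly in code. This sketch only enumerates window endpoints and the discrete times inside each window; the function names and the numeric values are illustrative assumptions.

```python
from typing import List, Tuple

def window(k: int, t0: float, delta: float, n: int) -> Tuple[float, float]:
    """k-th window [t0 + k*delta, gamma + k*delta], where gamma = t0 + n*delta."""
    gamma = t0 + n * delta
    return (t0 + k * delta, gamma + k * delta)

def grid(k: int, t0: float, delta: float, n: int) -> List[float]:
    """Discrete times in the k-th window at which a control is chosen."""
    start, _ = window(k, t0, delta, n)
    return [start + j * delta for j in range(n)]

for k in range(3):
    print(k, window(k, t0=0.0, delta=0.5, n=4), grid(k, t0=0.0, delta=0.5, n=4))
```

Each successive window drops the oldest interval and adds one new interval at the end, so consecutive windows overlap in N−1 intervals.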
We reformulate the control problem specified in (4) for the kth window, as follows,
$\begin{matrix} \min_{u (t) \in U} Qz (γ + k Δ) \\ \text{s.t. } z (t) = z (t_{0} + k Δ) + \int_{t_{0} + k Δ}^{t} F (z (τ), u (τ)) d τ \text{ for } t_{0} + k Δ \leq t \leq γ + k Δ . & (5) \end{matrix}$
We let k=0, 1, . . . , and consider the end of each window as corresponding to real-time.
Our approach is to chatter on the control, in a similar manner as in Kohn et al. (2010). In this paper, we use a probability distribution to derive a chattering control in a probability space, and find an approximation to the relaxed control problem.
The relaxed form of the control problem for the kth window specified in (5) is
$\begin{matrix} \min_{0 \leq α (t) \leq 1} Qz (γ + k Δ) \\ \text{s.t. } z (t) = z (t_{0} + k Δ) + \int_{t_{0} + k Δ}^{t} \int_{Ω} F (z (τ), u (τ) c) d α d τ \\ \int_{Ω} d α = 1 & (6) \end{matrix}$
where c is the control distribution defined by constructing the control probability distribution α. The resulting α* provides the optimal distribution that solves the problem in (5).
Problem (6) can be viewed as a relaxed optimal control problem with respect to the Young measure.
Note that Δ and γ should be sufficiently small, so that F remains continuous and differentiable in z, and measurable in the control.
The idea underlying the chattering control is to approximate the control trajectory using Lebesgue integration instead of the more traditional Riemann integration. See FIG. 2.
The chattering approximation to the relaxed problem (6) for the kth window is
$\begin{matrix} \min_{0 \leq α_{i} (t) \leq 1, \forall i \text{ and } t} Qz (γ + k Δ) \\ \text{s.t. } z (t + Δ) = z (t) + Δ \sum_{i = 1}^{I} F (z (t), c_{i}) α_{i} (t) \\ \sum_{i = 1}^{I} α_{i} (t) = 1 \\ \text{for } t = t_{0} + k Δ, t_{0} + k Δ + Δ, \dots, γ + k Δ - Δ & (7) \end{matrix}$
where c_i for i=1, . . . , I is the i-th quantization level of u(t) at time t, providing specific control levels for the Lebesgue integration, and I is the number of levels in the interval. For ease of notation, we let the number of levels I be the same over the whole time interval. The sum of α_i over i should equal one, for all t, since α_i represents a probability. Also, upper and lower bounds on α_i should be satisfied, i.e., 0≦α_i≦1 for all i and all t.
Notice that c_i(t) has the same dimension as u(t). The number of levels I should be large enough to cover the range of u(t). For example, if u(t) is in two dimensions, and the number of levels on dimension 1 is three, and the number of levels on dimension 2 is two, then the number of levels I is six.
If we know z(t₀+kΔ), we could solve the chattering problem (7) for α_i at discrete times t₀+kΔ, t₀+kΔ+Δ, . . . , γ+kΔ−Δ using an optimization engine, such as FMINCON. This involves evaluating F at all discrete times and all levels, so we only want to solve this chattering problem occasionally. Instead of solving it frequently, we develop continualized incremental dynamics and an incremental optimization that requires less computation. Then we only solve (7) when we have a large change in values or the accumulated error grows to be too large.
Also, if we know z(t₀+kΔ), we can easily solve the one-time period chattering problem for α_i(t₀+kΔ). For a single time period, the chattering problem (7) reduces to
$\begin{matrix} \min_{α_{i} (t_{0} + k Δ) \forall i} Q (z (t_{0} + k Δ) + Δ \sum_{i = 1}^{I} F (z (t_{0} + k Δ), c_{i}) α_{i} (t_{0} + k Δ)) \\ \text{s.t. } \sum_{i = 1}^{I} α_{i} (t_{0} + k Δ) = 1 \\ 0 \leq α_{i} (t_{0} + k Δ) \leq 1. & (8) \end{matrix}$
Notice that the one-time period chattering problem (8) is very easy to solve, since it is a type of relaxed knapsack problem, so the optimal α is determined by rank ordering the coefficients in the objective (since the coefficients in the constraint are all equal to one). Also note that Q only has two non-zero values, so only the first and last elements of F need to be evaluated (at z(t₀+kΔ) and all c_i).
Hence we know analytically that the optimal α_i is given by,
$\begin{matrix} α_{i}^{*} (t_{0} + k Δ) = \begin{cases} 1 & \text{if } i \text{ provides the smallest value of all } QF (z (t_{0} + k Δ), c_{j}), j = 1, \dots, I, \\ 0 & \text{otherwise} . \end{cases} & (9) \end{matrix}$
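The closed-form rule in (9) amounts to an argmin over the quantization levels, as sketched below. The augmented dynamics `F`, the five control levels and the state value are hypothetical and loosely follow the inventory example; ties are broken by taking the first minimizing level, which is an assumption.

```python
from typing import Callable, List, Sequence

def one_period_alpha(
    F: Callable[[List[float], float], List[float]],
    z: List[float],
    levels: Sequence[float],
    Q: Sequence[float],
) -> List[float]:
    """Solve the one-period problem: put all probability mass on the level c_i
    with the smallest value of Q F(z, c_i)."""
    scores = [sum(q * fi for q, fi in zip(Q, F(z, c))) for c in levels]
    best = min(range(len(levels)), key=scores.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(levels))]

def F(z: List[float], u: float) -> List[float]:
    """Hypothetical augmented dynamics, z = [x_{n+1}, x, y] with n = 1, no terminal cost."""
    x = z[1]
    return [(x - 10.0) ** 2 + u ** 2, u - 8.0, 0.0]

levels = [-50.0, -25.0, 0.0, 7.5, 15.0]  # quantization levels c_i (made-up)
Q = [1.0, 0.0, 1.0]
alpha = one_period_alpha(F, [0.0, 20.0, 0.0], levels, Q)
print(alpha)
```

In this example the rule selects the level u=0, because over a single period only the running cost enters QF and the effect of the control on the state is not yet visible; this myopia is one reason the full window problem (7) is still solved from time to time.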

2. Incremental Optimization and Continualized Incremental Dynamics

We start by adding an argument to the variables in (7) to indicate they are associated with the kth window problem. For instance, z(t, k) indicates the state at time t in the kth window. We also linearize with respect to F, yielding
$\begin{matrix} z (t, k) = z (t, k - 1) + δ z (t, k) + O (Δ^{2}) & (10) \end{matrix}$
for t₀+kΔ≦t≦γ+(k−1)Δ. We use a continualization approach to yield
$δ \dot{z} (t, k) = \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k)$
where δz(t,k) is the change in z(t, k).
For most of the systems we are considering, the value of the Jacobian matrix,
$\frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z}$
changes slowly with the state, so we can approximate it by its value at the state at the beginning of the time window z(t₀+kΔ,k). Thus, the matrix is only computed at the beginning of the time window. It also could be computed using finite differencing.
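The finite-differencing option mentioned above can be sketched as follows. The central-difference scheme, the step size `eps` and the example dynamics `F` are assumptions chosen for illustration; an embodiment might instead use analytic derivatives produced by the rules compiler.

```python
from typing import Callable, List

def jacobian_fd(
    F: Callable[[List[float], float], List[float]],
    z: List[float],
    c: float,
    eps: float = 1e-6,
) -> List[List[float]]:
    """Central-difference approximation of dF/dz at (z, c).
    Entry [r][s] approximates the derivative of F_r with respect to z_s."""
    n = len(z)
    cols = []
    for s in range(n):
        z_plus = list(z)
        z_plus[s] += eps
        z_minus = list(z)
        z_minus[s] -= eps
        f_plus, f_minus = F(z_plus, c), F(z_minus, c)
        cols.append([(a - b) / (2 * eps) for a, b in zip(f_plus, f_minus)])
    # cols[s][r] holds dF_r/dz_s; transpose into row-major form.
    return [[cols[s][r] for s in range(n)] for r in range(n)]

def F(z: List[float], u: float) -> List[float]:
    """Hypothetical augmented dynamics, z = [x_{n+1}, x, y] with n = 1."""
    x = z[1]
    return [(x - 10.0) ** 2 + u ** 2, u - 8.0, 0.0]

# Evaluate once at the state at the beginning of the window, for one level c_i.
J = jacobian_fd(F, [0.0, 20.0, 0.0], c=5.0)
print(J)
```

Only the entry relating the running cost to x is non-zero here (approximately 2(x−10)=20), and the matrix would be reused across the window until the accumulated error calls for recomputing it.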
With this approximation, the incremental optimization problem with continualized incremental dynamics is given by,
$\begin{matrix} \min_{α_{i} (t, k)} Q δ z (γ + k Δ, k) \\ \text{s.t. } δ \dot{z} (t, k) = \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k) \\ \sum_{i = 1}^{I} α_{i} (t, k) = 1 \\ 0 \leq α_{i} (t, k) \leq 1. & (11) \end{matrix}$
From Pontryagin's minimum principle for the incremental problem (11), the three necessary conditions for optimality are,
$\begin{matrix} 1. δ {\dot{z}}^{*} (t, k) = (\sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k)) δ z^{*} (t, k) & (12) \end{matrix}$
with initial condition δz(t₀+kΔ)
$\begin{matrix} 2. \dot{p} (t, k) = - (\sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k)) p (t, k) & (13) \end{matrix}$
with terminal condition p(γ+kΔ)=Q^T. Notice that the p equation is not incremental (p, not δp), but it includes an approximation because the Jacobian matrix is evaluated at the beginning of the window.
$3.$ $p^{T} (t, k) \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i}^{*} (t, k) δ z^{*} (t, k) \leq p^{T} (t, k) \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k)$
for α satisfying $\sum_{i = 1}^{I} α_{i} (t, k) = 1$ and $0 \leq α_{i} (t, k) \leq 1$.
Solving (12) and (13) will allow us to propagate over the length of the window, and then update the Jacobian matrix when the error reaches a threshold value.
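As a rough illustration of propagating (12) forwards and (13) backwards over one window with the Jacobian frozen at the start of the window, the following sketch uses a hand-written fourth-order Runge-Kutta integrator. The helper names, the two 2×2 Jacobians, the constant chattering levels, and the initial and terminal vectors are hypothetical values chosen for the example; any standard ODE solver could be used instead.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def blended_jacobian(jacobians: Sequence[Matrix], alpha: Sequence[float]) -> Matrix:
    """Return sum_i alpha_i * J_i, the frozen matrix appearing in (12) and (13)."""
    n = len(jacobians[0])
    return [[sum(a * J[r][c] for a, J in zip(alpha, jacobians)) for c in range(n)]
            for r in range(n)]


def matvec(A: Matrix, v: Sequence[float]) -> List[float]:
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rk4_linear(A: Matrix, y0: Sequence[float], t_span: Tuple[float, float],
               steps: int = 100) -> List[float]:
    """Integrate y' = A y from t_span[0] to t_span[1]; a negative step runs backwards."""
    h = (t_span[1] - t_span[0]) / steps
    y = list(y0)
    for _ in range(steps):
        k1 = matvec(A, y)
        k2 = matvec(A, [yi + 0.5 * h * k for yi, k in zip(y, k1)])
        k3 = matvec(A, [yi + 0.5 * h * k for yi, k in zip(y, k2)])
        k4 = matvec(A, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h * (a + 2 * b + 2 * c + d) / 6.0
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y


# Hypothetical window: two control combinations, chattering levels held constant.
gamma = 1.0
A = blended_jacobian([[[-1.0, 0.0], [0.0, -1.0]], [[-1.0, 0.0], [0.0, -3.0]]], [0.5, 0.5])
neg_A = [[-a for a in row] for row in A]

dz_end = rk4_linear(A, [1.0, 1.0], (0.0, gamma))       # (12): forwards from the initial condition
p_start = rk4_linear(neg_A, [1.0, 0.0], (gamma, 0.0))  # (13): backwards from the terminal condition
```

Because the blended matrix here is diagonal, both results can be compared with the exact matrix exponential, which is a convenient sanity check before using a time-varying α.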
The following discusses an example implementation that may be used for such chattering.

Setup

The optimization problem we're solving is the following:
$\min_{u} \int_{t_{0}}^{T} ℒ (x (t), u (t)) dt + ψ (x (T))$ $s . t . \dot{x} (t) = f (x (t), u (t))$
The function ℒ is the objective function; it is obtained by adding penalty terms for the rules to the original objective function specified in the DSL.
The function f specifies the dynamics of the system; these are specified in the DSL.
The variable x represents the state of the system at time t; the variable u is the control (to be found) at time t.
Finally, ψ represents the cost of being in the terminal state x(T). This term is typically set to 0, but does not have to be.
The rules-builder engine creates code that allows us to compute functions used in this system, specifically

- ℒ(x(t),u(t))
- ƒ(x(t),u(t))
- ψ(x(t))

For these algorithms, we introduce two new (scalar) variables
$x_{n + 1} (t) = \int_{t_{0}}^{t} ℒ (x (τ), u (τ)) d τ$ $ (t) = \int_{t_{0}}^{t} \frac{\partial ψ}{\partial x} f (x (τ), u (τ)) d τ$
Let z be the vector
$z (t) = [\begin{matrix} x_{n + 1} (t) \\ x (t) \\  (t) \end{matrix}]$
Define the function F(z, u) as
$F (z (t), u (t)) = [\begin{matrix} ℒ (x (t), u (t)) \\ f (x (t), u (t)) \\ \frac{\partial ψ (x (t))}{\partial x (t)} f (x (t), u (t)) \end{matrix}]$
Also define the matrix $Q = [\begin{matrix} 1 & 0_{1 × n} & 1 \end{matrix}]$.
Observe that all entries of F may be computed at any time based on the functions provided by the rule builder. (Use finite differencing to compute
$\frac{\partial ψ (x (t))}{\partial x (t)} .)$
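The construction of F and Q from the three rule-builder functions can be sketched as below. This is only an illustration under stated assumptions: `make_augmented_dynamics` and `make_Q` are hypothetical helper names, the state layout z = [running cost, x₁..xₙ, terminal-cost term] follows the definition of z above, and the one-state ℒ, f and ψ are invented for the example.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def make_augmented_dynamics(L: Callable[[Sequence[float], Sequence[float]], float],
                            f: Callable[[Sequence[float], Sequence[float]], Vector],
                            psi: Callable[[Sequence[float]], float],
                            h: float = 1e-6) -> Callable[[Sequence[float], Sequence[float]], Vector]:
    """Build F(z, u) for z = [running cost, x_1..x_n, terminal-cost term]."""

    def grad_psi(x: Sequence[float]) -> Vector:
        # Central finite differences for d(psi)/dx.
        g = []
        for j in range(len(x)):
            xp, xm = list(x), list(x)
            xp[j] += h
            xm[j] -= h
            g.append((psi(xp) - psi(xm)) / (2.0 * h))
        return g

    def F(z: Sequence[float], u: Sequence[float]) -> Vector:
        x = list(z[1:-1])  # strip the two scalar bookkeeping entries
        fx = f(x, u)
        terminal_rate = sum(g * v for g, v in zip(grad_psi(x), fx))
        return [L(x, u)] + list(fx) + [terminal_rate]

    return F


def make_Q(n: int) -> Vector:
    """Row vector [1, 0_{1 x n}, 1] that selects the two cost entries of z."""
    return [1.0] + [0.0] * n + [1.0]


# Hypothetical one-state problem used only to exercise the construction.
F = make_augmented_dynamics(
    L=lambda x, u: x[0] ** 2 + u[0] ** 2,
    f=lambda x, u: [-x[0] + u[0]],
    psi=lambda x: 0.5 * x[0] ** 2,
)
rates = F([0.0, 2.0, 0.0], [1.0])
Q = make_Q(1)
```

With x = 2 and u = 1 the three entries are the running cost 5, the state derivative −1, and the terminal-cost rate 2·(−1) = −2.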

Chattering—Offline Training

The notation here is the same as in the paper “Parametric Chattering Control of Dynamic Systems.”
Inputs:

- the functions ℒ, f, and ψ
- a time horizon T
- the time step Δ
- a time window size γ
- the initial state x(0)
- a guess δz₀ for the initial rate of change in the state z
- the terminal costate p(T), typically assumed to be 0
- u_min and u_max, the min and max possible values for each control u
- c, the number of control levels for each control u

Outputs:
Let N=T/Δ be the number of intervals.

- the optimal controls u(t) for each t=0, Δ, 2Δ, . . . , NΔ.
- the costate (assuming these controls) p(t) for each t=0, Δ, 2Δ, . . . , NΔ.
- the state (corresponding to picking optimal controls) x(t) for each t=0, Δ, 2Δ, . . . , NΔ.
- status information such as the number of iterations, whether there was a failure, etc.

Steps:


Algorithm 1 Finding optimal control within window k

1: For each component j = 1, ..., k, construct r_j control levels by linearly interpolating from u_min,j to u_max,j.
2: Create a list of all possible control combinations of each level for each control j = 1, ..., k. There are $I = \prod_{i = 1}^{k} r_{i}$ total such combinations. Let c_i be the ith such control combination, for i = 1, ..., I.
3: Solve the following problem:
$\min_{α} Q z (γ + k Δ)$
$\text{s.t. } z (t + Δ) = z (t) + Δ \sum_{i = 1}^{I} F (z (t), c_{i}) α_{i} (t)$
$\sum_{i = 1}^{I} α_{i} (t) = 1 \text{ for } t = t_{0} + k Δ, \dots, γ + k Δ - Δ$
$0 \leq α_{i} (t) \leq 1 \text{ for all } i, t$
Note this is a nonlinear optimization problem and is solved using a nonlinear solver such as FMINCON.
4: The optimal control at each time is given by $\sum_{i = 1}^{I} α_{i} c_{i}$.
5: Find p(t) at every time from t = t₀ to γ by numerically integrating the following ordinary differential equation backwards in time (using any standard solver):
$p (γ, 0) = Q^{T}$
$\dot{p} (t, k) = - (\sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k)) p (t, k)$
6: Find δz(t) at every time from t = t₀ to γ by numerically integrating the following ordinary differential equation forwards in time:
$δ z (0, 0) = {δz}_{0}$
$δ \dot{z} (t, k) = \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k)$
Note that the matrix
$\frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z}$
is the same between this step and the previous, so it doesn't have to be recomputed.
7: Update the state z at every time using
z(t, k) = z(t, k − 1) + δz(t, k)
8: Return the state z, the change in state δz, the chattering levels α, the controls u and the costate p at every time in the window.
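Steps 1 and 2 of Algorithm 1 (building the control levels and enumerating the control combinations c_i) can be sketched as follows. The function names and the two-control example bounds are hypothetical; only the linear interpolation and the Cartesian product are taken from the algorithm.

```python
import itertools
from typing import List, Sequence


def control_levels(u_min: float, u_max: float, r: int) -> List[float]:
    """Return r levels linearly interpolated from u_min to u_max (inclusive)."""
    if r < 1:
        raise ValueError("need at least one level")
    if r == 1:
        return [u_min]
    step = (u_max - u_min) / (r - 1)
    return [u_min + i * step for i in range(r)]


def control_combinations(u_mins: Sequence[float], u_maxs: Sequence[float],
                         levels: Sequence[int]) -> List[List[float]]:
    """Enumerate every combination of levels; the count is the product of the r_j."""
    grids = [control_levels(lo, hi, r) for lo, hi, r in zip(u_mins, u_maxs, levels)]
    return [list(c) for c in itertools.product(*grids)]


# Two hypothetical controls: 3 levels on [0, 1] and 2 levels on [-1, 1], so I = 3 * 2 = 6.
combos = control_combinations([0.0, -1.0], [1.0, 1.0], [3, 2])
```

Because I grows as the product of the per-control level counts, the number of levels per control is the main lever on the size of the optimization in step 3.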


Algorithm 2 Chattering offline training

1: Solve the chattering problem on the first window with k = 0 using Algorithm 1.
2: for k = 2, 3, ... (where k is the window number) do
3:   Solve the incremental optimization problem to get the optimal controls at the end of the window:
$\min_{α} Q δ z (γ + k Δ, k)$
$\text{s.t. } δ \dot{z} (t, k) = \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k)$
$\sum_{i = 1}^{I} α_{i} (t, k) = 1$
$0 \leq α_{i} (t, k) \leq 1 \text{ for all } i$
(This is a much easier optimization problem than finding the controls in Algorithm 1 as it is only for a single time step.)
4:   Update our estimates of p(t) at every time in the window by numerically integrating the following ODE backwards:
$p (γ, 0) = Q^{T}$
$\dot{p} (t, k) = - (\sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k)) p (t, k)$
5:   Update δz(t) at every time in the window by numerically integrating the following ODE forwards:
$δ z (0, 0) = {δz}_{0}$
$δ \dot{z} (t, k) = \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k)$
6:   Update the state z at every time using
z(t, k) = z(t, k − 1) + δz(t, k)
7: end for
8: Return the state z, the change in state δz, the chattering levels α, the controls u and the costate p at every time from 0 to T.

Chattering—Online Prediction
Inputs:

- the Hamiltonian function H
- the time step Δ
- the current state x(t)
- the current costate p(t)
- u_min and u_max, the min and max possible values for each control u
- c, the number of control levels for each control u

Outputs:

- the optimal control u(t)
- the new state x(t+Δ)
- the new costate p(t+Δ)

Steps:


Algorithm 3 Chattering online updating

1: Given x(t) and p(t) at the current time t.
2: Compute the optimal control u(t) by solving:
$\min_{α} Q (z (t_{0} + k Δ) + Δ \sum_{i = 1}^{I} F (z (t_{0} + k Δ), c_{i}) α_{i} (t_{0} + k Δ))$
$\text{s.t. } \sum_{i = 1}^{I} α_{i} (t_{0} + k Δ) = 1$
$0 \leq α_{i} (t_{0} + k Δ) \leq 1$
This problem has a closed-form solution, as mentioned in the paper.
3: $u (t) = \sum_{i = 1}^{I} α_{i} c_{i}$
4: Update δz(t) over the last Δ using
$δ z (0, 0) = {δz}_{0}$
$δ \dot{z} (t, k) = \sum_{i = 1}^{I} \frac{\partial F (z (t_{0} + k Δ, k), c_{i})}{\partial z} α_{i} (t, k) δ z (t, k)$
5: Update the state z at every time using
z(t, k) = z(t, k − 1) + δz(t, k)
6: Return $z (t_{i + 1})$ and u(t).
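The single-step problem in step 2 is linear in α over the simplex, so its minimum is attained by putting all weight on the control combination with the smallest value of QF(z, c_i), as in (9). The sketch below illustrates that selection. It is a simplification under stated assumptions: the state is advanced with one explicit Euler step instead of the incremental δz update of steps 4 and 5, and `chattering_step`, `example_F`, and the numeric values are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def chattering_step(F: Callable[[Sequence[float], Sequence[float]], Vector],
                    Q: Sequence[float], z: Sequence[float],
                    controls: Sequence[Sequence[float]],
                    dt: float) -> Tuple[Vector, Vector, Vector]:
    """Pick the one-hot chattering level of (9) and take one explicit Euler step."""
    costs = [sum(q * v for q, v in zip(Q, F(z, c))) for c in controls]
    best = min(range(len(controls)), key=lambda i: costs[i])
    alpha = [1.0 if i == best else 0.0 for i in range(len(controls))]
    u = list(controls[best])  # equals sum_i alpha_i * c_i for a one-hot alpha
    z_next = [zi + dt * ri for zi, ri in zip(z, F(z, u))]
    return alpha, u, z_next


def example_F(z: Sequence[float], u: Sequence[float]) -> Vector:
    # Hypothetical z = [accumulated cost, x]: running cost x^2 + u^2, dynamics x' = -x + u.
    x = z[1]
    return [x ** 2 + u[0] ** 2, -x + u[0]]


alpha, u, z_next = chattering_step(
    example_F, Q=[1.0, 0.0], z=[0.0, 2.0],
    controls=[[-1.0], [0.0], [1.0]], dt=0.1,
)
```

In this toy case the three candidate costs are 5, 4 and 5, so the middle control is selected and the state moves from [0, 2] to approximately [0.4, 1.8].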

Chattering—Computing optimal values for T and Δ
Inputs

- The function F(x, u, t) (from the rule builder).
- A required error tolerance (provided in the DSL by the customer).
- an initial guess for T and Δ

Outputs

- Δ, the time step to use
- T, the time horizon

Steps


Algorithm 4 Finding Δ and T

1: Solve the offline chattering problem from t = 0 to t = T to get controls and state for every time from t = 0 to T.
2: Compute the Jacobian matrices
$A (t) = \frac{\partial F (z (t))}{\partial z (t)}$
at each time t = 0 to t = T using finite differencing.
3: for i = 1, 2, ..., N do
4:   Let t = iΔ.
5:   Compute the error matrix $E (t) = \int_{0}^{t} (A (τ) - A (0)) e^{A (0) (τ - t_{0})} d τ$ via numerical integration.
6:   Compute the total error as the sum of the absolute values of the entries of E(t).
7:   if total error > error tolerance then
8:     Let T = (i − 1)Δ.
9:     break
10:   end if
11: end for
12: Compute the Jacobian matrix
$J = \frac{\partial F (z)}{\partial z}$
at the current time using finite differencing.
13: Find the eigenvalues λ₁, . . . , λ_n of J.
14: Take the eigenvalue λ_i with largest magnitude |λ_i|.
15: Let Δ = 1/|λ_i|.

The following discusses further details regarding possible use with such chattering.
In the chattering algorithm, there are two timing parameters, the length of the window T, and the time slice Δ (where NΔ=T). For accuracy purposes, these two parameters should be sufficiently small, but for computational efficiency, these two parameters should be large. Therefore, we want to keep the parameters as large as possible, as long as the error is tolerable.
The incremental state equation (for δz) relies on the Jacobian matrix evaluated at the beginning of the time window. To characterize the error, we compare two linear systems, one that has a constant matrix, denoted A(t₀), and the other has a time varying matrix, denoted A(t).
Consider both linear systems:
$\dot{x} (t) = A (t) x (t)$ (1)
$\dot{y} (t) = A (t_{0}) y (t)$ (2)
with the same initial conditions, x(t₀)=y(t₀).
The solution to (1) is of the form
$x (t) = φ (t, t_{0}) x (t_{0})$
where φ(t, t₀) satisfies several properties,
$\dot{φ} (t, t_{0}) = A (t) φ (t, t_{0})$
$φ (t_{0}, t_{0}) = I$
$φ^{-1} (t, t_{0}) = φ (t_{0}, t)$.
The solution to (2) is of the form
$y (t) = e^{A (t_{0}) (t - t_{0})} x (t_{0})$.
The error due to approximating A(t) with A(t₀) is:
$E (t) = φ (t, t_{0}) - e^{A (t_{0}) (t - t_{0})}$.
Taking the derivative yields
$\dot{E} (t) = \dot{φ} (t, t_{0}) - A (t_{0}) e^{A (t_{0}) (t - t_{0})}$
and replacing $\dot{φ} (t, t_{0})$ with A(t)φ(t, t₀) yields
$\dot{E} (t) = A (t) φ (t, t_{0}) - A (t_{0}) e^{A (t_{0}) (t - t_{0})}$
and substituting for φ(t, t₀) yields
$\begin{matrix} \dot{E} (t) = A (t) (E (t) + e^{A (t_{0}) (t - t_{0})}) - A (t_{0}) e^{A (t_{0}) (t - t_{0})} \\ = A (t) E (t) + (A (t) - A (t_{0})) e^{A (t_{0}) (t - t_{0})} . \end{matrix}$
This provides the error as a function of (A(t)−A(t₀)), and if the eigenvalues of A(t) for all t have real parts less than zero, the error is linearly proportional to (A(t)−A(t₀)).
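The error equation can be checked numerically in the scalar case, where the transition function is known in closed form. The sketch below integrates the error differential equation derived above with a fourth-order Runge-Kutta scheme and compares the result with φ(t, t₀) − e^{a(t₀)(t−t₀)}. The coefficient a(t) = −1 − 0.1t, the choice t₀ = 0, and the function names are assumptions made only for this illustration.

```python
import math


def a(t: float) -> float:
    # Hypothetical slowly varying scalar "Jacobian", negative for all t >= 0.
    return -1.0 - 0.1 * t


def error_rhs(t: float, E: float, t0: float = 0.0) -> float:
    """Right-hand side of E' = a(t) E + (a(t) - a(t0)) exp(a(t0) (t - t0))."""
    return a(t) * E + (a(t) - a(t0)) * math.exp(a(t0) * (t - t0))


def integrate_error(t_end: float, steps: int = 200) -> float:
    """RK4 integration of the error equation from t0 = 0, where E(t0) = 0."""
    h = t_end / steps
    t, E = 0.0, 0.0
    for _ in range(steps):
        k1 = error_rhs(t, E)
        k2 = error_rhs(t + h / 2, E + h * k1 / 2)
        k3 = error_rhs(t + h / 2, E + h * k2 / 2)
        k4 = error_rhs(t + h, E + h * k3)
        E += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return E


def exact_error(t: float) -> float:
    # phi(t, 0) = exp(integral of a) = exp(-t - 0.05 t^2); frozen solution is exp(-t).
    return math.exp(-t - 0.05 * t * t) - math.exp(-t)


E_num = integrate_error(2.0)
E_ref = exact_error(2.0)
```

The two values agree to within the integrator's accuracy, and the magnitude of the error grows with the window length, which is the effect that the choice of T is meant to bound.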
The procedure to choose T is to ask the customer what size of error is tolerable. The customer may say an error of 20% is tolerable, in which case the window length T can increase until reaching the associated threshold. The measure of the error is related to the difference,
$\left. \sum_{i = 1}^{I} \frac{\partial F}{\partial z} α_{i} \right|_{t} - \left. \sum_{i = 1}^{I} \frac{\partial F}{\partial z} α_{i} \right|_{t_{0}}$
The size of the time slice Δ is related to the magnitude of the largest eigenvalue of
$\frac{\partial F}{\partial z} .$
Let $|λ| = \sqrt{λ_{real}^{2} + λ_{imag}^{2}}$ and suppose |λ|_i has the largest value. Then
$Δ = \frac{1}{|λ|_{i}} .$
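A minimal sketch of this rule for choosing the time slice is shown below. To stay self-contained it handles only 2×2 Jacobians, using the closed-form eigenvalues of a 2×2 matrix; a general implementation would use a numerical eigenvalue routine instead. The function names and example matrices are hypothetical.

```python
import cmath
from typing import List, Sequence


def eigenvalues_2x2(J: Sequence[Sequence[float]]) -> List[complex]:
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = J
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return [(trace + disc) / 2.0, (trace - disc) / 2.0]


def time_step_from_jacobian(J: Sequence[Sequence[float]]) -> float:
    """Return 1 / |lambda| for the eigenvalue of largest magnitude."""
    largest = max(abs(lam) for lam in eigenvalues_2x2(J))
    if largest == 0.0:
        raise ValueError("all eigenvalues are zero; the rule gives no finite step")
    return 1.0 / largest


# Purely oscillatory example (eigenvalues +/- 2i) and a stiff real example (-1 and -10).
delta_oscillatory = time_step_from_jacobian([[0.0, 1.0], [-4.0, 0.0]])
delta_stiff = time_step_from_jacobian([[-1.0, 0.0], [0.0, -10.0]])
```

The magnitude is taken over the complex value, so real and imaginary parts both contribute, matching the definition of |λ| above.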
The following discusses further details regarding an example of synchronizing a decision module's model and current information with that of information from one or more other decision modules.
In the multi-agent self-organizing architecture of CDI, all agents synchronize in some way. This is analogous in the worst case to the many-body problem, which Isaac Newton first formulated and which has no general closed-form solution for three or more bodies. However, good approximations for systems of more than two bodies are computationally tractable. The approach in CDI is to solve a series of two-body problems that emulate a mean field aggregation, and update sequentially through a Pareto game.
Consider N agents, and each agent i, i=1, . . . , N, has its own optimization problem with criterion, state and control:
$\min_{u_{i}} J_{i} (x_{i}, u_{i})$
$\text{s.t. } \dot{x}_{i} = f (x_{i}, u_{i})$
The N-agent problem, posed as a single optimization problem, is not algorithmically solvable, but it is possible to solve algorithmically a Pareto-optimization problem with two players (the Pareto game), and thereby approximate the solution to the N-player problem. The two-agent (agent 1 and agent 2) Pareto-optimization problem is
$\min_{u_{1}, u_{2}, α_{1}, α_{2}} α_{1} J_{1} (x_{1}, u_{1}) + α_{2} J_{2} (x_{2}, u_{2})$
subject to
$\dot{x}_{1} = f (x_{1}, u_{1})$
$\dot{x}_{2} = f (x_{2}, u_{2})$
$α_{1} (t) + α_{2} (t) = 1$
The CDI approach is that each agent i plays a two-agent game with the CDI Mean Field agent composed of the criteria from all other agents except agent i. Instead of solving the two-agent optimization problem directly, we convert the formulation to a Hamiltonian problem (using Pontryagin's minimum principle).
The Hamiltonians are additive for Pareto optimization. Therefore, we can add local Hamiltonians for individual agents to create an aggregate mean field Hamiltonian.
Let $H_{i}$ be the local Hamiltonian for agent i and let $H_{i}^{MF}$ be the mean field Hamiltonian composed of the Hamiltonians for all other agents, excluding i. That is, the mean field Hamiltonian for agent i is a functional form of the Hamiltonians of the other agents.
Then the two-agent optimization problem is
$\min_{u_{i}, u_{i}^{MF}, α_{1}, α_{2}} α_{1} H_{i} (x_{i}, u_{i}) + α_{2} H_{i}^{MF} (x_{i}^{MF}, u_{i}^{MF})$
subject to the state and costate equations of the combined Hamiltonian, and
$α_{1} (t) + α_{2} (t) = 1$.
The state consists of $[x_{i}, x_{i}^{MF}]^{T}$, the costate is $[p_{i}, p_{i}^{MF}]^{T}$, and the control is $[u_{i}, u_{i}^{MF}]^{T}$. The initial conditions for $x_{i}^{MF}$ and $p_{i}^{MF}$ come from the previous solution, and $u_{i}$ is the local solution from the previous pass.
The solution to the two-agent Pareto game provides α₁(t) and α₂(t), and the total Hamiltonian for agent i is updated:
$H_{i}^{T} = α_{1} H_{i} + α_{2} H_{i}^{MF}$
The modified Hamiltonian for agent i is constructed by projecting the total Hamiltonian into the local state space. The modified mean field Hamiltonian is constructed by projecting the total Hamiltonian into the mean field state space.
Next, another agent plays the game, solving the two-agent Pareto game and updating locally, and each agent updates its mean field agent when it is ready to play its game.
The following discusses further details regarding example embodiments.

Overview

The Cooperative Distributed Inference (CDI) platform is an advanced technology offering that enables near-optimal and near-real-time decision making over vast amounts of heterogeneous and distributed information in complex knowledge-based decision support systems, combining absolute, hard and soft rules to handle various requirements from natural or governing laws, policies, and best practices. FIG. 17 illustrates a diagram 1700 of various interactions of different portions of a CDI system.
The CDI platform features a Distributed Architecture (DA) for resolving queries by accessing information from both an Internal Heterogeneous Database (IHDB) and external data sources referred to as Sensors. CDI utilizes a network of computing devices as its nodes—called Decision Elements (DE)—that cooperate to resolve queries given to them.
The Decision Elements (DEs) in a given DA can work together to reach the best outcome for the whole group, i.e., reaching Pareto Efficiency (also referred to as Pareto equilibrium), a stable state where no change by any individual can make the sum of the whole group better.
Each DE can solve the problem independently if provided with complete knowledge and data. A DE solves the query using Optimal Control Theory, starting with a technique called analytic continuation (transforming the query and rules into differential equations whose dependent variables represent internal variables and parameters of the rules), then using its own internal knowledge to solve the query and provide an outcome.
In distributed environments, groups of Decision Elements synchronize in an iterative process, utilizing updates from other DEs, and provide a final result via a Pareto multi-criteria optimization strategy.
The CDI platform needs rules, not necessarily exact criteria, to reach the objective state and produce a near-optimal solution. This is an attractive feature when exact quantitative criteria cannot be provided in advance due to uncertainty or other reasons.

Platform Features

CDI is well suited to big data analytics, supporting many data types:

- Structured: databases, ontologies
- Semi-structured: spreadsheets, CSV/TSV, forms, emails
- Unstructured: web pages, blogs, text documents
- Symbolic: business rules, math formulas
CDI's distributed computing architecture also makes it scalable enough to ingest large amounts (peta-scale) of heterogeneous data while producing real-time analytic results, in some cases within microseconds (depending in part on the resources available and the complexity of the queries).

Rules Support

CDI can integrate different types of business rules: absolute, hard and soft rules, from natural or government laws, operational or policy requirements, and practice guidelines. Absolute rules and hard rules always take logic value 0 (false) or 1 (true) when instantiated. Soft rules, however, may take any value in the interval [0,1], or more generally more than two values. Absolute rules reflect a should-satisfy situation, such as FDA/USDA requirements; hard rules are operational requirements, such as “no serious persistent side effects”, which may be temporarily relaxed in specific situations; and soft rules can come from guidelines or experience, such as “better not give drug XYZ to diabetes patients with heart problems”. FIG. 18 illustrates a diagram 1800 of examples of different types of rules. All chaining of the rules may happen automatically in a dynamic manner during the optimization process. CDI handles this complexity with the techniques described above, by solving a distributed continuous-space optimization problem.

Self-Adapting and Learning

The CDI platform features a self-learning design. Because CDI converts the original query solving into an optimal control problem, it can use feedback from the environment (external sensors or internal updates from peer decision elements) to refine its internal model: a Hamilton-Jacobi-Bellman equation is updated to reflect new information and automatically forms soft-rule-like constraints internally. The process of updating the internal mathematical model is similar to reconstructing 3D images from CT or MRI imaging by tomographic reconstruction, but applied to dynamic systems.

Scalability and Performance

The CDI platform enables functionality to:

- Integrate data that may include 1,000,000+ variables and 100,000+ constraints
- Provide data integration over an evolving distributed network
- Specify queries over a broad range of languages
- Specify queries of a broad range of complexity
- Provide the best known response to queries at the local level
- Operate in a variety of environments, including cloud-based or local deployments
- Process queries in real time

Comparisons

CDI stands out in being able to do the following things:

- Approximate the optimal solution of NP-hard problems (such as planning and scheduling) by mapping criteria and constraints onto a continuous space and solving the result as an Optimal Control problem with polynomial-time algorithms. The solutions offered by CDI may be near optimal rather than exactly optimal, but the running time is greatly shortened and some large-scale intractable problems can become solvable by CDI.
- Blend adaptive learning with rules composition. Traditional rule engines might support hard and soft rules (constraints), but the scoring mechanism and weights need to be manually adjusted to meet the goals, while in CDI these weights can be optimized as internal configurations via a feedback mechanism.

These abilities make CDI well suited to intelligent decision support systems.

Sample Applications

The CDI platform works well in areas with large amounts of heterogeneous data, government compliance requirements and business requirements, such as healthcare and energy.

Clinical Auto-Coding Application

A Clinical Auto-Coding (C.A.C.) application can be used to detect medical under-coding, over-coding, and miscoding, highlighting potential opportunities where higher billing is justified. It automatically generates clinical codes, including ICD-10, directly from clinical encounter notes such as physician notes, lab results, and discharge records, while supporting workflows accommodating the roles played by administrators, coders and doctors in coding. FIG. 19 illustrates an example user interface 1900 related to medical/clinical auto-coding. C.A.C. incorporates user feedback to learn coding best practices as it processes records. With its adaptive technology, C.A.C. improves over time, optimizing coding for each organization to improve the efficiency, accuracy and revenue capture of the medical coding activity.

Clinical Intelligence Web Services

CDI powers the following intelligent healthcare services that can be easily integrated:

- Clinical Record Intelligence: For analysis of EMRs (electronic medical records) and encounter notes, and identification of actionable clinical terms and concepts in those documents. The services can extract concepts, understand issues, and analyze documents.
- Clinical Coding Intelligence: For inferring clinical codes from a set of clinical concepts, grouping codes, correlating codes, analyzing documents, audit and justification, and compliance.
- Patient-Centric Intelligence: Secure delivery of end-user applications to predict, recommend, and explain personalized actions to improve patient outcomes. It provides a patient-centric view, and can provide medical risk prediction and proactive monitoring for patients, automated data abstraction, actionable recommendations, and more.

All these web services run in a secure cloud with full compliance measures in place.

Fraud, Waste and Abuse Detection

In healthcare, CDI helps assess and monitor the risk of medical fraud, waste and abuse (FWA) by uncovering providers, and to a lesser extent pharmacies, who are suspected of having committed fraud via a variety of schemes. Similar application domains include financial fraud.
By ingesting a wide variety of data that is difficult to link (including live public data), a FWA service analyzes data to find evidence and patterns of fraud, and provides a dashboard application for agents/detectives to prioritize and act on the discovered suspected cases of medical fraud. The service features strict and fuzzy rules-based detection as well as automatic pattern discovery, and runs in real-time and continuous mode to support proactive monitoring and action.

Energy Intelligence

A Smart Grid can involve uncertain bi-directional exchange in distributed grids, so intelligent control can be used to provide active synchronization between the network of element controllers and the outside grid management system, which allows a high quality of service to be maintained in a cost-effective manner, and the dynamics can be learned from sensory observations. CDI enables distributed micro-grid control and supports inductive modeling with “soft” rules that are learned and continuously updated to optimize the grid.

Distributed Architecture (DA)

The DA is a network of interacting components called decision elements (DEs). The DEs collaborate in the resolution of a query posed by one of them. The DA's block diagram is shown in FIG. 18, where Sensors refer to external input data in general. There is an additional translation layer to process external data to be consumed by DEs.

Decision Element (DE)

A decision element is a higher-level functional component that solves queries locally. It can have subcomponents such as a programmable search engine, internal heterogeneous database, inference engine, inference rule base, API/user interface, and network interface. Given complete input data, a decision element is capable of providing a quick and near-optimal solution to a complex query.

Internal Heterogeneous Database (IHDB) and External Knowledge Base (EKB)

IHDB is data preprocessed and stored by a specific decision element (DE). FIG. 20 illustrates a diagram 2000 of some components of a CDI system, including Knowledge Bases containing data sources ranging from domain knowledge to general facts. IHDB encodes knowledge into knowledge components (KC's). Each KC is used and updated by a DE in the DA, and multiple KCs may share rules.
External Knowledge Base (EKB) refers to data, including rules, that is input into specific decision elements, such as a patient's body temperature or blood pressure, or an instantiated rule to determine if a patient's cholesterol level is high. EKB can also contain information communicated from other DEs. Domains for variables include real, complex, integer and binary numbers, and symbolic tokens on finite domains.

Interfacing Components

The following components act between users and the data via API and/or GUI.

- Rule Entry Interface: Provides for the entry of rules into the IHDB, validates the specification of rules before insertion, and routes the rules to the appropriate DEs for insertion into their respective knowledge components.
- Sensor Ingestion Interface: A sensor is a machine or service where external data exist. The Sensor Ingestion Interface (SII) enables the system to add or remove a sensor, poll a sensor, and submit data to the network.
- Query Language Interface: Accepts the query, submits it to the system (Distributed Architecture), and provides the response. It is an API, and a flexible UI can be built on top of it.

Minimization Function Generator (MFG)

The minimization function generator converts a query to a minimization function (i.e., analytic continuation). This is useful because the problem is converted from a search problem in a large discrete space (like graph search problems, which are usually NP-complete) into an optimization problem in continuous space (for which polynomial algorithms exist). An analogy is integer programming, which is NP-hard, whereas continuous linear programming can be solved very efficiently.
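The integer-versus-continuous analogy can be made concrete with a textbook knapsack problem. The sketch below is not part of the CDI platform and the item values are invented: it compares a brute-force search over 0/1 choices, whose cost doubles with each added item, with the continuous relaxation in which each item may be taken fractionally, which a greedy pass by value-to-weight ratio solves exactly.

```python
from itertools import product
from typing import Sequence


def knapsack_integer(values: Sequence[float], weights: Sequence[float],
                     capacity: float) -> float:
    """Exhaustive search over all 2^n 0/1 selections (discrete problem)."""
    best = 0.0
    for choice in product((0, 1), repeat=len(values)):
        weight = sum(w * c for w, c in zip(weights, choice))
        if weight <= capacity:
            best = max(best, sum(v * c for v, c in zip(values, choice)))
    return best


def knapsack_relaxed(values: Sequence[float], weights: Sequence[float],
                     capacity: float) -> float:
    """Continuous relaxation (each item fraction in [0, 1]) solved greedily."""
    order = sorted(range(len(values)), key=lambda i: values[i] / weights[i], reverse=True)
    total, remaining = 0.0, capacity
    for i in order:
        if remaining <= 0:
            break
        take = min(1.0, remaining / weights[i])
        total += take * values[i]
        remaining -= take * weights[i]
    return total


values, weights, capacity = [60.0, 100.0, 120.0], [10.0, 20.0, 30.0], 50.0
integer_optimum = knapsack_integer(values, weights, capacity)
relaxed_optimum = knapsack_relaxed(values, weights, capacity)
```

The relaxed optimum bounds the integer optimum from above and is found after a single sort, which mirrors the trade described in the text: a continuous formulation is cheaper to solve but may return a near-optimal rather than exactly optimal answer for the original discrete problem.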

Query Response Engine (QRE)

The Query Response Engine is the core of the whole system and is responsible for solving the query (locally). A mathematical model is constructed based on the equations obtained from the previous steps, containing the current state and object state (goal state). The continuous-space optimization automatically handles forward and backward rule chaining by moving along the trajectory toward the target state. It also manages uncertainty by keeping a large set of possible states and reducing the solution space only when more information becomes available. Standard optimization techniques (e.g., the Newton-Raphson method) can be employed to solve the problem.
Feedback and updates (data from the EKB or other DEs in the architecture) are used to refine the mathematical model over time; therefore, the core engine is self-adapting.

Pareto Multi-Criteria Optimization Engine

The Pareto Multi-Criteria Optimization Engine (PMOE) is the aggregation step where all DEs in the network settle to obtain a good stable solution: a state where no improvements can be made to any individual DE without reducing the whole team's performance, i.e., a state that belongs to the Pareto Optimal Set in Game Theory. In effect, each DE plays a game, and the DEs communicate, interact and work together to produce the best outcome for the criteria (query). To efficiently synchronize all decision elements, Mean Field Theory is applied for dimension reduction using the knowledge obtained. FIG. 21 illustrates a diagram 2100 of performing Pareto processing for use with mean field techniques, including to reach Pareto efficiency.

Data Exchange Specifications

The system can take many different data types via an ingestion API and has adapters for different public data sources. The supported data types include common ones from XML, CSV, TSV, SQL, Spreadsheet, JSON, and more. OData support is also available. Output types can be API (XML or JSON), as well as exporting to CSV, SQL or directly to services such as SOLR and Cassandra.

Running Environment

CDI can run in the cloud as a Software as a Service platform. End-to-end support can be provided by setting up the infrastructure in a Virtual Private Cloud, or by deploying and configuring it on a cluster that the client provides. The system features a SaaS architecture and provides APIs to be used by third parties. For advanced integration requests, Java/Scala and Python libraries are available.
To sum up, the CDI platform utilizes many advanced techniques from Mean Field Theory, Lagrangian and Hamiltonian functions, Pareto Optimal Set, Gauge Theory, and so on, derived from modern mathematics, quantum physics, optimal control and game theory, to achieve high performance. In the detailed computation process, many transformation, approximation, optimization, caching and other advanced computing techniques are used to improve accuracy, speed and scalability. Due to this approach, CDI is able to solve complex decision-making problems in a smooth, efficient manner and achieve near-optimal results.
Each agent in the Distributed Architecture is a Decision Element that can solve the problem independently, provided with a knowledge base and data. The agent uses Optimal Control Theory to obtain a near-optimal solution, starting with a technique called analytic continuation (transforming the query and rules into differential equations whose dependent variables represent internal variables and parameters of the rules), then using its own internal knowledge to solve the problem as an optimization problem in continuous space, to allow for efficient solving.
The outcome will be fed back to the Rules Editor and optimization process (the Chattering algorithm) to enable automatic adjustment of weights of soft rules (constraints) and achieve optimal score of the objective function.
A CDI agent is an independent decision element that can take in sensor data (input from the environment, streaming or in batches), as well as the knowledge-bases (rules composed by domain experts, including absolute, hard and soft rules) to generate a set of partial differential functions (Lagrangian constraints), through continualization. Similarly, the objective is also converted into a minimization function, as part of the Lagrangians which will be solved via the Hamilton-Jacobi-Bellman equation. Each agent will talk to other agents via a “mean field” abstraction layer to greatly reduce the communication and computation overhead, and incorporate additional information to reach the global optimal state (a stable, near optimal set of states across all agents). Finally, the agent exercises control over the system. FIG. 22 illustrates a network diagram 2200 of an example decision module agent.
Various domain knowledge is captured from experts in the form of a Domain Specific Language. Some are constraints; logic forms need to be converted to Boolean equations; and variables in soft rules take values in [0, 1]. This creates a set of equations to be solved by the optimization algorithm. The optimization process takes data from a time range and finds the best state configuration to reach optimality. It starts with a “learning” process to find a good initial configuration by taking in a short history of data, before processing real-time streams. FIG. 23 illustrates a network diagram 2300 of an example of offline workflows for knowledge capture.
Each CDI agent computes a “mean field” view of the system via its neighbors, and responds to queries with the latest updates. The approximate mean field view of a group greatly reduces computational dimensions. The agents synchronize with others and understand the global state via the mean-field approximation. They engage in games to reach a Pareto optimum (also referred to as Pareto equilibrium), the best output. FIG. 24 illustrates a network diagram 2400 of an example of workflows for mean field computation and Pareto Optimal.
An example home solar micro-grid system illustrates one example embodiment of a CDI application or automated control system, which takes the sensor data from the solar panel (in the house), substation, and power network, and decides whether or not to fulfill the utility requests in real time using a set of complex rules. FIG. 25 illustrates a network diagram 2500 of an example of an automated control system for a home solar micro-grid electrical generating system.
FIGS. 26-28 provide further details regarding operations of portions of example CDI systems and their sub-components, and FIGS. 29A-29K illustrate examples of using a CDI system to iteratively determine near-optimal solutions over time for controlling a target system in diagrams 2900A-2900K. In particular, FIG. 26 illustrates a further diagram 2600 of workflow and components of a portion of a CDI system, FIG. 28 illustrates a diagram 2800 of an overview workflow for a portion of a CDI system, and FIG. 27 illustrates a diagram 2700 of workflow for an inference process portion of a CDI system.
Further details related to an example CDI system are shown below and included in provisional U.S. Patent Application No. 62/015,018, filed Jun. 20, 2014 and entitled “Methods And Systems For Cooperative Distributed Inferencing,” which is hereby incorporated by reference in its entirety. In addition, further details related to using gauges to perform model error measurements are included in provisional U.S. Patent Application No. 62/182,796, filed Jun. 22, 2015 and entitled “Gauge Systems,” which is also hereby incorporated by reference in its entirety. Furthermore, further details related to examples of using the CDD system in particular manners with particular types of target systems are included in provisional U.S. Patent Application No. 62/182,968, filed Jun. 22, 2015 and entitled “Applications Of Cooperative Distributed Control Of Target Systems,” which is also hereby incorporated by reference in its entirety. The example types of target systems discussed include, for example, a vehicle being controlled, controlling inventory management, controlling a micro-grid electrical generation facility, controlling activities in a day trader financial setting, etc.


	Current notation	Comments

	t	Notation for algorithmic time.
	q(t)	Notation of canonical coordinate vector for entire system.
	q	Notation of canonical coordinate vector dropping time dimension.
	q^(f)	Notation of canonical coordinate vector for specific function f.
	{dot over (q)}	Notation of first time derivative of canonical coordinate vector.
	{umlaut over (q)}	Notation of second time derivative of canonical coordinate vector.
	h	Notation of HEAD of Horn clause.
	φ(q)	Notation of generic proposition.
	σ(q)	Notation of generic proposition alternate to φ.
	T_i	Notation of the TV of a soft rule.
	{hacek over (r)}(q; φ, σ)	Generic equational form relating two propositions.
	{hacek over (φ)}(q)	Notation of the equational form of φ(q).
	φ_Q(q)	Notation for proposition defined by the query.
	{hacek over (φ)}_Q(q)	Notation for equation defined by the query.
	J(q)	Notation for minimization function for the query.
	ℒ	Notation for static Lagrangian.
	ℒ_k^(o,T)	Notation for total static Lagrangian for DE_k.
	q
	{p_a}
	u^(k)
	H_k^(o)	Primary Hamiltonian for the absolute rules for DE_k.
	H_k^(A)	Hamiltonian for the Tellegen agent of the total Hamiltonian's rules.
	H_k^(T)	Total Hamiltonian for DE_k.

Overview

This document introduces and specifies the architecture for the Cooperative Distributed Inferencing Platform (CDIP). The primary instance of this is the Distributed Architecture (DA) for resolving queries by accessing both an Internal Heterogeneous Database (IHDB) populated by a special class of Horn Clause rules and external data sources referred to as sensors.
The architecture implements a network of active devices at its nodes. These devices are called Decision Elements (DE's). The DE's cooperate in the resolution of a query posed to one or several of them. The DE's in a given DA are referred to as the team.
Every DE in a team is programmed to transform rules in its domain, determined by a posed query, into an ordinary differential equation (ODE), whose dependent variables represent internal variables and parameters. The dependent variables include unknowns of the query posed to the DE. The DE's in the architecture are synchronized via a Pareto multi criteria optimization strategy.
The components of the CDIP include:

- Application requirements that the system is designed to accommodate.
- Functional requirements that satisfy the application requirements and pertain directly to the construction and operation of the system components.
- Subcomponents which are necessary to implement the functional requirements.
- Limitations which highlight noteworthy constraints that are inherent in the specified implementation of the architecture.
- Architectural flow describes key aspects of the architecture that indicate how the system is to be constructed given the specified essence and key behavior of the subcomponents.
- Software realization of the architecture describes the key pieces of software necessary for system implementation.
- Data describes the kinds of data the system is expected to accept as input and produce as output.
- Data exchange protocols reference key data types and structures that need to be exchanged across the system and the protocols for exchange.
- Environment describes the particulars of the environments that the system will be able to operate in and therefore should be tested in.
- Testing describes how the system should be tested given the data and operating environments.

Subcomponents

Subcomponents are fundamental parts of the architecture that perform particular roles. This section contains descriptions for each of the subcomponents of the architecture. The subcomponents are:

- The Distributed Architecture (DA).
- The Internal Heterogeneous Database (IHDB).
- The Rule Entry Interface (REI).
- The Rule Editor (RE).
- The External Knowledge Base (EKB).
- The Sensor Ingestion Interface (SII).
- The Rule Conversion Engine (RCE).
- The Decision Element (DE).
- The Query Language Interface (QLI).
- The Minimization Function Generator (MFG).
- The Query Response Engine (QRE).
- The Pareto Multi-Criteria Optimization Engine (PMOE).

Distributed Architecture (DA)

The DA is a platform of interacting components called DE's. The DE's collaborate in the resolution of a query posed by one of them. The DE's implement a distributed, dynamic optimization process, referred to herein as the optimization process (OP). The OP computes an answer to the active queries as a function of data stored in two categories of repositories: the IHDB and the EKB's. These repositories hold the data needed to implement the OP for a given query.
The EKB's are a collection of public or private repositories of knowledge relevant to the DE posing a query. A DE has a list of external repositories (LER). Each entry in an LER includes 1) a protocol, 2) a heading sub-list, and 3) a translation grammar. Each protocol entry prescribes the access procedure to the corresponding knowledge repository. Each heading sub-list entry contains a summary of the knowledge contents of the corresponding repository. Finally, each translation grammar entry provides a procedure for converting knowledge elements of the corresponding repository into the rule representation in the IHDB of the DE.
Functional characteristics of this architecture and in particular, the DE's, IHDB, and the sensors are described, including the following concepts:

- The DA
- A process for resolving queries by accessing the IHDB and External Knowledge Bases (EKB's) through sensors
- The constitution of DE's
- A query and corresponding rules transformation into an ODE
- The orchestration of a team of DE's through a Pareto multi criteria optimization strategy

The Internal Heterogeneous Database (IHDB)
Composition of IHDB as a Set of Knowledge Components
The IHDB encodes knowledge about the implemented application. The IHDB is divided into knowledge components (KC's). Each KC is consulted and updated by a DE in the DA. Any pair of KC's may have an overlapping set of rules by which they operate, but there is no a priori constraint on intersections or inclusion. The collection of KC's constitutes the existing knowledge of the system.
Algorithmic Formulation of a Rule
A KC is a collection of rules, written in a restrictive Horn clause format. The rules are logic entities. When a rule is instantiated, it has a logic value. The logic values a rule can have are taken from the interval [0,1]. The entire system of rules is evaluated using variables and parameters which are collectively referred to as the generalized coordinates of the system and are indexed as follows:
$\begin{matrix} q (t) = \{ q^{(1)} (t), \dots, q^{(N)} (t) \} . & (3.2 - 1) \end{matrix}$
The time argument t refers to the algorithmic time of the system which means that it has no other meaning than as a continuous index with respect to the evolution of the system. There is therefore no requirement that it correspond to a physical aspect of the system although this may naturally occur. Physical time may be represented specifically by a canonical coordinate of choice q⁽ⁱ⁾(t). Alternatively, we may refer to the q's without expressly stating the independent time argument and write
$\begin{matrix} q = \{ q^{(1)}, \dots, q^{(N)} \} . & (3.2 - 2) \end{matrix}$
Then we should also note that the time derivatives are referred to notationally as
$\begin{matrix} \dot{q} = \frac{dq (t)}{dt}, \ddot{q} = \frac{d^{2} q (t)}{{dt}^{2}} & (3.2 - 3) \end{matrix}$
These coordinates are referred to variously as q depending on the context and the expected arguments of the function to which they are applied. When it is necessary to distinguish between more than one q in equational form we generally write q_f, where f denotes the reference function or appropriate domain. Typically, we assume without loss of generality that the entire set of canonical coordinates q is an argument to any function, term or proposition. In practice, we may further assume it is possible to apply the particular required coordinates as needed to the mathematical construct in question.
The rules in each knowledge component are of three types: absolute rules, hard rules, and soft rules. Absolute rules and hard rules take logic value 0 (false) or 1 (true) when instantiated. Soft rules take any value in the interval [0,1].
The format of the restrictive Horn Clauses in the IHDB is illustrated in Equation 3.2-2. A Horn Clause is an object composed of two objects, a HEAD and a BODY, connected by backward implication (⇐). The logic implication transfers the logic value of the BODY to the HEAD. If the rule is an absolute rule or a hard rule, the logic value is 1 (if the BODY is logically true) or 0 (if the BODY is logically false). If the rule is a soft rule, the logic value transferred by the body is any number in [0,1].
The HEAD is a data structure composed of two objects: A name, h, and a list of arguments described by the argument vector q=(q⁽¹⁾, . . . , q⁽ⁿ⁾). The list of arguments includes variables and parameters. The variables take values in the domain of the rule and the parameters are constants passed to the rule and unchanged by the instantiation of the rule. The domain of the rule is a set of values that each of its variables can take. In general, variables can take values over numerical or symbolic domains. As an example, a symbolic domain can be a list of diseases. A numeric domain can be a set of pairs of numbers representing blood pressure.
For the applications of CDI, the domains for variables are: real numbers, complex numbers (floating point and floating point complex numbers), integer numbers, binary numbers and symbolic tokens on finite domains.
The BODY of a clause is a data structure composed of one or more terms, denoted φ_i(q). The composition operation is extended-and, denoted by ∧. The extended-and works as a regular and in absolute rules and hard rules and as a functional product on soft rules.
A rule with a head but not a body is called a fact. A fact's truth value is determined on the basis of the instantiation of its variables.
Each term in the body of a rule is an extended disjunction (extended-or, denoted by ∨) of sub-terms. The ∨ operator behaves like the standard or for absolute and hard rules and behaves in a functional form, described later, when connecting sub-terms encoding heads of soft rules.
A sub-term is either the HEAD of a rule, a relation or a truth valuation (TV). When it is a HEAD it may have the same name as the one in the HEAD of the rule but with different arguments. This provides a recursive mechanism for rule evaluation.
When a rule has a sub-term that is the head of another rule it is said that the two rules are chained together by the corresponding sub-term. Note that a rule can be chained to several rules via corresponding sub-terms.
Constraint Domains
Constraint domains augment the BODY clause of Horn clauses to facilitate dynamic programming. Constraints are specified as a relationship between terms. Define the relationship between two terms
φ(q) rel σ(q) (3.2-4)
as a member of the following set
rel ∈ {=, ≠, ≦, ≧, <, >, statistical propagation}. (3.2-5)
A relation can be of two types: numeric or symbolic. Numeric relations establish equational forms between two functional forms. (For the initial phase only polynomial and affine linear functional forms will be considered.)
In general, an equational form is a set of one or more relations. For numeric relations, φ(q) rel σ(q), rel ∈ {=, ≠, ≦, ≧, <, >, statistical propagation}. The table below gives the relations considered and their symbols.

TABLE EE

Numeric Relation	Symbol	Code Form

Equality	=	φ = σ
Disequation	≠	φ \= σ
Less-inequality	<	φ < σ
Less-equal	≦	φ =< σ
Greater-inequality	>	φ > σ
Greater-equal	≧	φ >= σ

The adopted code forms are the ones used in constraint logic programming.
A symbolic relation can be of two types: inclusion and constraint. Inclusion relations are of the form:
q ∈ Set (3.2-6)
where q is a variable or a parameter, ∈ is the inclusion symbol and Set is a set of symbolic forms or a set of numbers or a composite set of the form shown in the table below.
TABLE FF

Composite Set	Symbol	Code Form

Intersection	∩	Set1 /\ Set2
Union	∪	Set1 \/ Set2
Complement	\	\ Set

Constraint forms of the symbolic relational type may be one or a set of the forms presented in the table below. For symbolic relations, φ(q) rel σ(q), rel ∈ {=, ≠, ⊂, ⊃, ⊆, ⊇}.

TABLE GG

Symbolic Relation	Symbol	Code Form

Equal	=	φ #= σ
Not Equal	≠	φ #\= σ
Is Contained	⊂	φ #< σ
Contains	⊃	φ #> σ
Is Contained or Equal	⊆	φ #=< σ
Contains or Equal	⊇	φ #>= σ

A TV is either a variable or a constant with values in the interval [0,1]. The TV of a Horn Clause that is an absolute rule or a hard rule can only take two values: 1 or 0. The TV when instantiated is 0 or 1. If the TV for an absolute or hard rule is 1, the rule is said to be in inactive state; if the TV is 0, the rule is said to be in active state.
The TV, T_i, of a soft rule satisfies
0 ≦ T_i ≦ 1. (3.2-7)
If T_i above satisfies
T_i ≧ T_threshold, (3.2-8)
the soft clause is said to be in inactive state. If
T_i < T_threshold, (3.2-9)
the soft clause is said to be in active state, where T_threshold is a constant in [0,1] defined for each soft clause. The default value is 0.5.
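The active and inactive states described above can be illustrated with a short sketch. The function name `is_active` and the string labels for the rule types are assumptions introduced only for this illustration; the comparison logic follows the convention stated above, in which a TV of 1 (or a soft TV at or above its threshold) corresponds to the inactive state.

```python
def is_active(tv: float, rule_type: str, threshold: float = 0.5) -> bool:
    """Return True when a rule is in the active state under the convention above."""
    if not 0.0 <= tv <= 1.0:
        raise ValueError("a truth valuation must lie in [0, 1]")
    if rule_type in ("absolute", "hard"):
        if tv not in (0.0, 1.0):
            raise ValueError("absolute and hard rules take only the values 0 or 1")
        return tv == 0.0  # TV of 1 is the inactive state, TV of 0 the active state
    if rule_type == "soft":
        return tv < threshold  # (3.2-9); T_i >= T_threshold is the inactive state
    raise ValueError(f"unknown rule type: {rule_type}")


# Example instantiations (hypothetical values).
states = [
    is_active(1.0, "hard"),
    is_active(0.0, "absolute"),
    is_active(0.3, "soft"),
    is_active(0.5, "soft"),
]
```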
This concludes the description of the knowledge representation. The instantiation process of the goal in a DE, as a function of its knowledge base, is carried out by the inference engine of the DE. This process is the central component of CDI and will be described later in the document.

Summary of Terminology

The following table summarizes the terminology we have just reviewed.

	TABLE HH

	Reference term	Definition

	proposition	Defined as a construct as in the propositional calculus where the proposition takes on the value of true or false.
	term	Defined recursively according to its assigned sub-term.
	sub-term	A sub-term may be a Horn clause, a relation between two other sub-terms or an extended truth valuation depending on the context of absolute, hard or soft rules. In the case of absolute and hard rules it may be evaluated as a proposition. In the case of soft rules it takes a value on the interval [0, 1] and is considered to be active or true in the case that it exceeds its specific threshold.
	Horn clause	A disjunction of terms with at most one positive term.
	definite clause	A Horn clause with exactly one positive term.
	goal clause	A Horn clause with no positive terms.
	fact	A definite clause with no negative terms.
	head	The positive term of a definite clause.
	inactive state	The case when a rule will not apply for constrained optimization.
	active state	The case when a rule will apply for constrained optimization.
	truth value, TV	The value that is used to determine whether a rule is active or inactive.

Horn Clause Example

The following example illustrates a Horn clause:
has_fever(name, temperature, white_count, heartrate, blood_pressure) ⇐
(temperature > 37) ∧
((heartrate ≧ 70) ∨
bp(name, temperature, blood_pressure) ∨
wc(name, white_count)) (3.2-10)
The clause establishes under which conditions the patient named name has a fever. The variables in the clause are:
name, temperature, white_count, heartrate, blood_pressure.
When instantiated they represent, respectively, the name of the patient, his current body temperature, his white blood cell count, his heart rate range, and his blood pressure.
The clause body includes other clauses: bp (blood pressure) and wc (white count).
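As a minimal sketch of how such a clause might be evaluated, the following assumes that the body combines the temperature test by extended-and with an extended-or over the remaining sub-terms, in line with the BODY description above. The helper names `ext_and` and `ext_or`, the simplified argument lists, and the numeric cut-offs inside `bp` and `wc` are hypothetical and are not taken from this description.

```python
def ext_and(*values: float) -> float:
    """Extended-and: ordinary 'and' on {0, 1}, functional product on [0, 1]."""
    result = 1.0
    for v in values:
        result *= v
    return result


def ext_or(*values: float) -> float:
    """Extended-or, using the form a + b - a*b (ordinary 'or' on {0, 1})."""
    result = 0.0
    for v in values:
        result = result + v - result * v
    return result


def bp(blood_pressure: float) -> float:
    # Hypothetical sub-clause body: the cut-off of 140 is an assumption.
    return 1.0 if blood_pressure >= 140 else 0.0


def wc(white_count: float) -> float:
    # Hypothetical sub-clause body: the cut-off of 11000 is an assumption.
    return 1.0 if white_count > 11000 else 0.0


def has_fever(temperature, white_count, heartrate, blood_pressure) -> float:
    """Truth value of the HEAD, transferred from the BODY."""
    return ext_and(
        1.0 if temperature > 37 else 0.0,
        ext_or(1.0 if heartrate >= 70 else 0.0, bp(blood_pressure), wc(white_count)),
    )


fever_tv = has_fever(temperature=38.5, white_count=9000, heartrate=80, blood_pressure=120)
```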
This completes the specification of the rule-based framework. The next step is to specify a complete process for converting all rules of this form to a set of equations.
Rule Entry Interface (REI)
The Rule Entry Interface provides a mechanism for:

- Providing an API for the entry of rules into the IHDB.
- Validating the specification of rules to be inserted into the IHDB.
- Routing the rules to the appropriate DE's for insertion to their respective KC's.

Rule Editor (RE)
The Rule Editor allows users to specify rules associated with the systems to be interrogated.
External Knowledge Base (EKB)

- It may be distributed or not
- It may be persisted or not
- It may be persisted locally or remotely to an agent
- It may or may not be architecturally co-located with the IHDB
- A sensor may include any source of data used by the agent
- It may use various types of buses for data communication
- Sensors may or may not be co-located with agents/DEs

Rule Conversion Engine (RCE)
The rule conversion engine converts rules of the IHDB into equations.
Method for Specification of a Simple Term as an Equation
Consider the term φ(q) with the following truth assignment.
$\begin{matrix} ϕ (q) = \left\{ \begin{matrix} T & q \in _{ϕ} \\ F & q \in _{ϕ} ⋐ \end{matrix} \right. & (3.6 - 1) \end{matrix}$
Then we can define the set of arguments which yield positive truth assignment.
{q ∈ _φ | φ(q) ← T}. (3.6-2)
Define the corresponding equation {hacek over (φ)}(q) of the term φ(q) as
$\begin{matrix} \check{ϕ} (q) = \left\{ \begin{matrix} 1 & ϕ (q) \leftarrow T \\ 0 & ϕ (q) \leftarrow F \end{matrix} \right. & (3.6 - 3) \end{matrix}$
Then extend the range of {hacek over (φ)}(q) to the closed unit interval
{hacek over (φ)}(q)→[0,1]. (3.6-4)
Revisiting the taxonomy of absolute, hard and soft rules, we recognize that hard and soft rules (terms in this example) can take values along the interval
0≦{hacek over (φ)}(q)≦1. (3.6-5)
And for absolute rules we add the constraint {hacek over (φ)}(q)→{0,1}
{hacek over (φ)}(q)(1−{hacek over (φ)}(q))=0. (3.6-6)
Conversion of Fundamental Clauses of Propositional Calculus to Equations
Define the following notation for the propositional calculus.

	TABLE II

	Symbol	Function

	∧	And
	∨	Or
	⇒	Implication
	~	Not
	∃	Exists
	∀	All
		Fuzzy rule

Theorem 3.6.1.
Given the method for the specification of equations from propositions, we prove the following transformations.

TABLE JJ

Proposition	Equation

~φ(q)	1 − {hacek over (φ)}(q)
φ(q) ∧ σ(q)	{hacek over (φ)}(q) · {hacek over (σ)}(q)
φ(q) ∨ σ(q)	{hacek over (φ)}(q) + {hacek over (σ)}(q) − {hacek over (φ)}(q) · {hacek over (σ)}(q)
φ(q) ⇒ σ(q)	1 − {hacek over (φ)}(q) + {hacek over (φ)}(q) · {hacek over (σ)}(q)
φ₁(q) ∧ φ₂(q) ∧ ··· ∧ φ_k−1(q) ∧ φ(q) ⇒ φ(q) (tail recursive)	$\check{\tilde{ϕ}} (n, q) = \frac{\check{h} (n - 1, q) + \check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q) - 1}{\check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q)}$

Proof by enumeration for equational representation of conjunction
Define the function {hacek over (r)}(q; φ, σ) which represents the equation corresponding to conjunction (∧). Verify by enumeration the correspondence of the mathematical equation values corresponding to the mapping T→1 and F→0.

TABLE KK

φ(q)	φ(q) ∧ σ(q)	σ(q)	{hacek over (r)}(q; φ, σ) = {hacek over (φ)}(q) · {hacek over (σ)}(q)

T	T	T	1 = 1 · 1
T	F	F	0 = 1 · 0
F	F	T	0 = 0 · 1
F	F	F	0 = 0 · 0

Proof by enumeration for equational representation of disjunction
Define the function {hacek over (r)}(q; φ, σ) which represents the equation corresponding to disjunction (∨). Verify by enumeration the correspondence of the mathematical equation values corresponding to the mapping T→1 and F→0.

TABLE LL

φ(q)	φ(q) ∨ σ(q)	σ(q)	{hacek over (r)}(q; φ, σ) = {hacek over (φ)}(q) + {hacek over (σ)}(q) − {hacek over (φ)}(q) · {hacek over (σ)}(q)

T	T	T	1 = 1 + 1 − 1 · 1
T	T	F	1 = 1 + 0 − 1 · 0
F	T	T	1 = 0 + 1 − 0 · 1
F	F	F	0 = 0 + 0 − 0 · 0

Proof by enumeration for equational representation of negation
Define the function {hacek over (r)}(q; φ, σ) which represents the equation corresponding to negation (~). Verify by enumeration the correspondence of the mathematical equation values corresponding to the mapping T→1 and F→0.

TABLE MM

φ(q)	~φ(q)	{hacek over (r)}(q; φ, σ) = 1 − {hacek over (φ)}(q)

T	F		0 = 1 − 1
F	T		1 = 1 − 0

Proof by enumeration for equational representation of implication
Define the function {hacek over (r)}(q; φ, σ) which represents the equation corresponding to implication (⇒). First note the equivalence of
φ(q) ⇒ σ(q) and ~φ(q) ∨ σ(q) (3.6-7)
Verify by enumeration the correspondence of the mathematical equation values corresponding to the mapping T→1 and F→0.

TABLE NN

φ(q)	φ(q) ⇒ σ(q)	σ(q)	{hacek over (r)}(q; φ, σ) = 1 − {hacek over (φ)}(q) + {hacek over (φ)}(q) · {hacek over (σ)}(q)

T	T	T	1 = 1 − 1 + 1 · 1
T	F	F	0 = 1 − 1 + 1 · 0
F	T	T	1 = 1 − 0 + 0 · 1
F	T	F	1 = 1 − 0 + 0 · 0
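The proofs by enumeration for negation, conjunction, disjunction and implication can also be checked mechanically. The following is a minimal sketch; the function names are assumptions introduced for illustration, and the last line merely shows that the same formulas accept soft values in [0,1].

```python
from itertools import product


def eq_not(p):
    return 1 - p


def eq_and(p, s):
    return p * s


def eq_or(p, s):
    return p + s - p * s


def eq_implies(p, s):
    return 1 - p + p * s


def check_tables() -> bool:
    """Compare each equational form with its truth table under T -> 1, F -> 0."""
    for p in (False, True):
        assert eq_not(int(p)) == int(not p)
    for p, s in product((False, True), repeat=2):
        a, b = int(p), int(s)
        assert eq_and(a, b) == int(p and s)
        assert eq_or(a, b) == int(p or s)
        assert eq_implies(a, b) == int((not p) or s)
    return True


tables_ok = check_tables()

# The same formulas extend to soft truth values in the unit interval.
soft_and, soft_or = eq_and(0.8, 0.5), eq_or(0.8, 0.5)
```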

Proof for equational representation of tail recursion

Tail recursion is propositionally defined as
$\begin{matrix} ϕ (q) \Leftarrow ϕ_{1} (q) \wedge ϕ_{2} (q) \wedge \dots \wedge ϕ_{k - 1} (q) \wedge ϕ (q) & (3.6 - 8) \end{matrix}$
where s represents the current state. To develop an equational representation of the recursive formulation, first define the general function {tilde over (φ)}(n, q) where n represents the n-th iteration of the tail recursion and {tilde over (φ)}(n, q) is the logical consequent. Then rewrite the above formulation using the recursive step.
$\begin{matrix} {\tilde{ϕ}}_{1} (n, q) \wedge {\tilde{ϕ}}_{2} (n, q) \wedge \dots \wedge {\tilde{ϕ}}_{k - 1} (n, q) \wedge \tilde{ϕ} (n - 1, q) \Rightarrow \tilde{ϕ} (n, q) & (3.6 - 9) \end{matrix}$

Define

$\begin{matrix} \tilde{σ} (n, q) ≈ {\tilde{ϕ}}_{1} (n, q) \wedge {\tilde{ϕ}}_{2} (n, q) \wedge \dots \wedge {\tilde{ϕ}}_{k - 1} (n, q) \\ \tilde{r} (n - 1, q) ≈ \tilde{σ} (n, q) \wedge \tilde{ϕ} (n - 1, q) & (3.6 - 10) \end{matrix}$
Then the tail recursion is rewritable as
$\begin{matrix} \tilde{σ} (n, q) \wedge \tilde{ϕ} (n - 1, q) \Rightarrow \tilde{ϕ} (n, q) \\ \tilde{r} (n - 1, q) \Rightarrow \tilde{ϕ} (n, q) . & (3.6 - 11) \end{matrix}$
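According to the equational representation of implication, let
$\begin{matrix} \check{h} (n - 1, q) = 1 - \check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q) + \check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q) \cdot \check{\tilde{ϕ}} (n, q), & (3.6 - 12) \end{matrix}$
since by definition $\check{\tilde{r}} (n - 1, q) = \check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q)$. Solving (3.6-12) for $\check{\tilde{ϕ}} (n, q)$ then gives
$\begin{matrix} \check{\tilde{ϕ}} (n, q) = \frac{\check{h} (n - 1, q) + \check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q) - 1}{\check{\tilde{σ}} (n, q) \cdot \check{\tilde{ϕ}} (n - 1, q)} & (3.6 - 13) \end{matrix}$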
with boundary condition n=0.
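The relationship between (3.6-12) and (3.6-13) can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and the inputs are arbitrary soft truth values. It evaluates the implication form for a chosen consequent and then recovers that consequent with the recursion step.

```python
def implication_value(sigma: float, phi_prev: float, phi_n: float) -> float:
    """Equation (3.6-12): h = 1 - r + r * phi_n with r = sigma * phi_prev."""
    r = sigma * phi_prev
    return 1.0 - r + r * phi_n


def recursion_step(h: float, sigma: float, phi_prev: float) -> float:
    """Equation (3.6-13): recover phi(n, q) from h(n - 1, q)."""
    r = sigma * phi_prev
    if r == 0.0:
        raise ZeroDivisionError("sigma * phi_prev must be nonzero for (3.6-13)")
    return (h + r - 1.0) / r


# Round trip with hypothetical soft values.
sigma, phi_prev, phi_n = 0.9, 0.8, 0.6
h = implication_value(sigma, phi_prev, phi_n)
recovered = recursion_step(h, sigma, phi_prev)
```

The division in (3.6-13) is only defined when the product of the two antecedent values is nonzero, which is why the sketch guards that case explicitly.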
Converting rules based system of inference to the problem of constrained minimization
Consider the following example.

- Converting rules to constraints
  The preceding discussion has established an algorithm for converting rules of the form

$\begin{matrix} h (q) \Leftarrow ϕ_{1} (q) \wedge ϕ_{2} (q) \wedge \dots \wedge ϕ_{m} (q) & (3.6 - 14) \end{matrix}$
to constraints of the form
{hacek over (h)}(q)={hacek over (φ)}₁(q)·{hacek over (φ)}₂(q)· . . . ·{hacek over (φ)}_m(q). (3.6-15)
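A minimal sketch of the constraint form (3.6-15) follows. The function names are assumptions for illustration; the residual is zero exactly when the head value equals the product of the body term values.

```python
from math import prod
from typing import Sequence


def body_value(terms: Sequence[float]) -> float:
    """Product of the equational forms of the body terms, as in (3.6-15)."""
    return prod(terms)


def constraint_residual(head: float, terms: Sequence[float]) -> float:
    """Residual of the equality constraint h(q) = phi_1(q) * ... * phi_m(q)."""
    return head - body_value(terms)


# Hard-rule instantiations satisfy the constraint with zero residual.
hard_true = constraint_residual(1, [1, 1, 1])
hard_false = constraint_residual(0, [1, 0, 1])

# Soft-rule terms propagate a value in [0, 1] to the head.
soft_head = body_value([0.5, 0.5])
```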
Decision Element (DE)
A diagram of the Decision Element Architecture is shown in illustration OO.
It is composed of six elements:

- Programmable search engine (PSE)
- Internal heterogeneous database (IHDB)
- Inference engine (IE)
- Inference rule base (IRB)
- API/user interface
- Network interface (NI)

List of External Repositories (LER)
A DE has a List of External Repositories (LER). Each entry in an LER includes 1) a protocol, 2) a heading sub-list, and 3) a translation grammar. Each protocol entry prescribes the access procedure to the corresponding external knowledge repository. Each heading sub-list entry contains a summary of the knowledge contents of the corresponding repository. Finally, each translation grammar entry provides a procedure for converting knowledge elements of the corresponding repository into the rule representation in the IHDB of the DE.
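One way to picture an LER entry is as a small record with the three parts named above. The sketch below is an assumption made for illustration only: the class name `LerEntry`, the field names, the example protocol string, and the toy `name:arg1,arg2` grammar are hypothetical and are not specified in this description.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LerEntry:
    """One entry in a DE's List of External Repositories (LER)."""
    protocol: str                    # access procedure for the repository
    headings: List[str]              # summary of the repository's knowledge contents
    translate: Callable[[str], str]  # translation grammar: element -> IHDB rule text


def to_rule(element: str) -> str:
    # Hypothetical grammar: "name:arg1,arg2" becomes the fact "name(arg1, arg2)".
    name, args = element.split(":")
    return f"{name}({', '.join(args.split(','))})"


entry = LerEntry(protocol="https", headings=["cardiology"], translate=to_rule)
rule_text = entry.translate("bp:name,temperature,blood_pressure")
```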
Programmable Search Engine
The programmable search engine implements a standard hashing algorithm for detecting active rules as a function of the current instantiation of the variables in a variable buffer (VB) of the IE, and the contents of the active rule buffer (ARB). The VB contains the variables that form part of the query and all additional variables incorporated to this buffer during the inference process (IP). The VB includes all relevant data from the EKB beneficial to perform the query. The IP will be described below. The ARB contains all the currently active rules in the IP.
The search hashing algorithm is characterized by the search rules in the Inference Rule Base.
Internal Heterogeneous Database
The IHDB is the repository of the application clauses associated with the DE. These encode the domain of knowledge characterizing the expertise of the DE. For example in a medical application, a decision element may deal with expertise on heart illnesses, and the corresponding clauses might encode diagnoses and treatments for these diseases.
Inference Engine
The IE encodes an algorithm, the IP, for assigning values to the variables appearing in the query.
Inference Rule Types
The DE incorporates inference rules (IR) that are a collection of rules for transforming and inferring instantiations of the goal. These rules provide the Inference Engine with directives for processing database rules to give a satisfactory instantiation to a given query or to request additional information so that a satisfactory instantiation can be generated. They are organized according to their functionality as follows. (See Illustration PP)
Equation Rules
These rules include the formal rules for inference. This includes all rules for natural language modeling from first principles.
Optimizer Rules
These rules include rules for finding the interior point in optimization.
Search Rules
These rules include rules for identifying the nature of insufficient potential. The goal is to apply these rules to acquire additional information required to satisfy the optimization goal.
Adaptation Rules
Adaptation rules are used to update the soft rules to relax them further to reduce the complexity and constraints of the optimization problem. The adaptation also serves to update the search rules to improve information acquisition.
Language Statistics and Pattern Rules
These rules embody the machine learning models.
Network Rules
These rules define how information is distributed over the network and what information is available from which resources.
Hybridization Rules
These rules define how other rules may be combined.
User Interface
The UI provides the utilities for entering queries, pragma rules, displaying query answers, status and for general interaction with the IE.
Network Interface
The NI provides a generic mechanism for interacting with other DE's via a procedure termed companionship. The companionship procedure implements the active coupling for the cooperation of the DE's in query resolution. This procedure is not hierarchical and implements a Pareto Agreement set strategy as the mechanism for CDI.
Query Language Interface (QLI)
Information about the QLI is available elsewhere herein.
Process for Determining Active Constraints
The process for determining active constraints is available elsewhere herein.
Minimization Function Generator (MFG) and Determining Active Constraints
The minimization function generator converts a query to a minimization function. Again, we assume without loss of generality that the entire set of canonical coordinates q is an argument to any proposition φ_i. In practice, we may further assume it is possible to apply the particular required coordinates as needed to the proposition or function in question. Then let φ^(k) be the set of propositions associated with DE_k in the context of query Q. These propositions are composed of the proposition associated with the query φ_Q(q), and other propositions φ_i(q), comprising the constraints of the system. The proposition φ_Q(q) associated with a given query Q can be converted to an equation {hacek over (φ)}_Q(q). Queries that are satisfiable specify a set
{q|φ _Q(q)←T} (3.10-1)
Similarly, a satisfied query represented as an equation is also a set
{q|{hacek over (φ)} _Q(q)=1}. (3.10-2)
Relaxing the values that {hacek over (φ)}_Q(•) can take to include the unit interval so that soft rules are incorporated yields the following constrained optimization expression. Let J(q)=({hacek over (φ)}_Q(q)−1)². Then specify the optimization
$\begin{matrix} \min_{q} J (q) & (3.10 - 3) \end{matrix}$

Subject to:

- 1. {hacek over (φ)}_Q(q)≦1
- 2. {hacek over (φ)}_Q(q)≧0
- 3. A knowledge base on the set {{hacek over (φ)}_1(q), . . . , {hacek over (φ)}_n(q), . . . , {hacek over (φ)}_n+s(q)} ⊂ {hacek over (φ)}^(k), which represents a further set of active constraints specific to the problem:
  - a. {hacek over (φ)}_i(q)≧0 for 1≦i≦n,
  - b. {hacek over (φ)}_i(q)≦1 or, equivalently −({hacek over (φ)}_i(q)−1)≧0 for 1≦i≦n,
  - c. and in the case of absolute rules {hacek over (φ)}_l(q)(1−{hacek over (φ)}_l(q)) = 0 for n<l≦n+s.

Introduce the indicator functions

$\begin{matrix} V_{{\check{ϕ}}_{i}}^{-} = \left\{ \begin{matrix} 0 & {\check{ϕ}}_{i} (q) \geq 0 \\ \infty & {\check{ϕ}}_{i} (q) < 0 \end{matrix} \right. & (3.10 - 4) \\ \text{and} \\ V_{{\check{ϕ}}_{i}}^{+} = \left\{ \begin{matrix} 0 & 1 - {\check{ϕ}}_{i} (q) \geq 0 \\ \infty & 1 - {\check{ϕ}}_{i} (q) < 0 \end{matrix} \right. & (3.10 - 5) \end{matrix}$
which yields the two logarithmic barrier functions
$\begin{matrix} {\check{V}}_{{\check{ϕ}}_{i}}^{-} = - \log ({\check{ϕ}}_{i} (q)) & (3.10 - 6) \end{matrix}$
and
$\begin{matrix} {\check{V}}_{{\check{ϕ}}_{i}}^{+} = - \log (1 - {\check{ϕ}}_{i} (q)) & (3.10 - 7) \end{matrix}$
According to the method of Lagrange multipliers, combine this with the equality constraints to form the static Lagrangian function
$\begin{matrix} ℒ (q; {\check{ϕ}}_{Q}, {\check{ϕ}}^{(k)}, ω_{1}^{(+)}, \dots, ω_{n}^{(+)}, ω_{n + 1}^{(-)}, \dots, ω_{2 n}^{(-)}, ω_{2 n + 1}^{(λ)}, \dots, ω_{2 n + s}^{(λ)}, ω_{2 n + s + 1}^{(Q)}, ω_{2 n + s + 2}^{(Q)}) = {\check{ϕ}}_{Q} (q) + \sum_{i = 1}^{n} [ω_{i}^{(+)} {\check{V}}_{{\check{ϕ}}_{i}}^{+} + ω_{n + i}^{(-)} {\check{V}}_{{\check{ϕ}}_{i}}^{-}] + \sum_{l = 1}^{s} ω_{2 n + l}^{(λ)} {\check{ϕ}}_{n + l} (q) (1 - {\check{ϕ}}_{n + l} (q)) - ω_{2 n + s + 1}^{(Q)} \log ({\check{ϕ}}_{Q} (q)) - ω_{2 n + s + 2}^{(Q)} \log (1 - {\check{ϕ}}_{Q} (q)), & (3.10 - 8) \end{matrix}$
the roots of which can be found using a formulation of Newton-Raphson. Since ℒ here includes absolute, hard and soft rules, we may call it the total static Lagrangian for DE_k and refer to it as ℒ^(T).
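The structure of the static Lagrangian (3.10-8) can be sketched as a plain function of already-evaluated equation values. This is an illustrative sketch rather than the described implementation: the function and parameter names are assumptions, and the upper barrier is taken as −log(1 − {hacek over (φ)}_i(q)), which matches the indicator condition 1 − {hacek over (φ)}_i(q) ≧ 0 in (3.10-5).

```python
import math
from typing import Sequence, Tuple


def barrier_lower(phi: float) -> float:
    """Logarithmic barrier for phi >= 0."""
    return -math.log(phi)


def barrier_upper(phi: float) -> float:
    """Logarithmic barrier for 1 - phi >= 0 (assumed form, see lead-in)."""
    return -math.log(1.0 - phi)


def static_lagrangian(
    phi_q: float,
    soft: Sequence[float],      # values of the n inequality-constrained terms, in (0, 1)
    absolute: Sequence[float],  # values of the s absolute-rule terms
    w_plus: Sequence[float],
    w_minus: Sequence[float],
    w_lambda: Sequence[float],
    w_q: Tuple[float, float],
) -> float:
    value = phi_q
    for phi, wp, wm in zip(soft, w_plus, w_minus):
        value += wp * barrier_upper(phi) + wm * barrier_lower(phi)
    for phi, wl in zip(absolute, w_lambda):
        value += wl * phi * (1.0 - phi)  # equality term phi * (1 - phi) = 0
    value -= w_q[0] * math.log(phi_q) + w_q[1] * math.log(1.0 - phi_q)
    return value


# Hypothetical evaluation with unit weights.
example = static_lagrangian(0.5, [0.5], [1.0], [1.0], [1.0], [1.0], (1.0, 1.0))
```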
Construct Equations of Motion
Information for equations of motion is available elsewhere herein.
Query Response Engine (QRE) which Includes Process for Constructing Differential Equations
Application of Newton-Raphson
Consider a continuous analog of the independent variables of ℒ(•)
$\begin{matrix} q = q (t) = [\begin{matrix} q^{(1)} (t) \\ ⋮ \\ q^{(v)} (t) \end{matrix}] & (3.12 - 1) \end{matrix}$
where each of the v total independent variables of ℒ(•) is mapped to its corresponding position in q(t), the column vector that is represented with a lower-case q. To reiterate, the independent variable t refers to algorithmic time, as opposed to physical time, which may also be represented in the system. The corresponding unconstrained optimization goal can be written as
$\begin{matrix} \min_{q} ℒ (q^{(1)} (t), \dots, q^{(v)} (t)) & (3.12 - 2) \end{matrix}$
so that the gradient vanishes at the minimizer,
$\begin{matrix} \nabla ℒ (q (t)) = [\begin{matrix} \frac{\partial ℒ}{\partial q^{(1)}} \\ ⋮ \\ \frac{\partial ℒ}{\partial q^{(v)}} \end{matrix}] = [\begin{matrix} \nabla ℒ_{1} \\ ⋮ \\ \nabla ℒ_{v} \end{matrix}] = 0, & (3.12 - 3) \end{matrix}$
with positive definite Hessian matrix
$\begin{matrix} \nabla^{2} ℒ (q (t)) = [\begin{matrix} \frac{\partial^{2} ℒ}{\partial q^{(1)} \partial q^{(1)}} & \dots & \frac{\partial^{2} ℒ}{\partial q^{(1)} \partial q^{(v)}} \\ ⋮ & ⋱ & ⋮ \\ \frac{\partial^{2} ℒ}{\partial q^{(v)} \partial q^{(1)}} & \dots & \frac{\partial^{2} ℒ}{\partial q^{(v)} \partial q^{(v)}} \end{matrix}] = [\begin{matrix} \nabla ℒ_{11} & \dots & \nabla ℒ_{1 v} \\ ⋮ & ⋱ & ⋮ \\ \nabla ℒ_{v 1} & \dots & \nabla ℒ_{vv} \end{matrix}] \succ 0. & (3.12 - 4) \end{matrix}$
Write the recursion for Newton's method
$\begin{matrix} q_{(k + 1)} (t) = q_{(k)} (t) - {(\nabla^{2} ℒ (q_{(k)} (t)))}^{- 1} \nabla ℒ (q_{(k)} (t)) . & (3.12 - 5) \end{matrix}$
This is equivalently rewritten
$\begin{matrix} \frac{q_{(k + 1)} (t) - q_{(k)} (t)}{δ} = - \frac{1}{δ} {(\nabla^{2} ℒ (q_{(k)} (t)))}^{- 1} \nabla ℒ (q_{(k)} (t)) . & (3.12 - 6) \end{matrix}$
Via continualization we approximate the derivative
$\begin{matrix} \dot{q} (t) = \frac{dq (t)}{dt} = - {(\nabla^{2} ℒ (q (t)))}^{- 1} \nabla ℒ (q (t)) . & (3.12 - 7) \end{matrix}$
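The link between the discrete recursion and its continualized form can be sketched in one dimension. This is a minimal illustration, not the method of the source: the objective f(q) = q⁴/4 + q²/2 − q, the function names, and the forward-Euler integrator are all assumptions chosen so that the second derivative is positive everywhere.

```python
def grad(q):
    """Gradient of the hypothetical objective f(q) = q**4 / 4 + q**2 / 2 - q."""
    return q**3 + q - 1.0

def hess(q):
    """Second derivative of the same objective; positive for every q."""
    return 3.0 * q**2 + 1.0

def newton_flow(q0, step, n_steps):
    """Forward-Euler integration of dq/dt = -hess(q)**-1 * grad(q).

    step = 1.0 reproduces the classical Newton recursion (3.12-5);
    smaller steps track the continuous trajectory (3.12-7) more closely.
    """
    q = q0
    for _ in range(n_steps):
        q = q - step * grad(q) / hess(q)
    return q
```

For example, `newton_flow(2.0, 1.0, 20)` and `newton_flow(2.0, 0.1, 400)` both settle at the same stationary point, the first in a few large Newton steps and the second by following the flow in small increments.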
Translation of Inverted Matrix
Consider A, an invertible and positive definite matrix. Then we make the following provable assertions.

- 1. A^TA is symmetric.
- 2. −A^TA has negative eigenvalues.

Define

$\begin{matrix} \frac{dM (t)}{dt} = - A^{T} AM (t) + A^{T} & (3.12 - 8) \end{matrix}$
Then as t→∞, M(t)→A⁻¹=(∇²ℒ(q_(k)(t)))⁻¹. Using (3.12-3) and (3.12-4), approximate {dot over (q)}(t) by rewriting the derivative in the context of M(t). This yields the following two equations.
$\begin{matrix} \dot{q} (t) = - M (t) \nabla ℒ (q (t)) = - [\begin{matrix} m_{11} & \dots & m_{1 v} \\ ⋮ & ⋱ & ⋮ \\ m_{v 1} & \dots & m_{vv} \end{matrix}] [\begin{matrix} \nabla ℒ_{1} \\ ⋮ \\ \nabla ℒ_{v} \end{matrix}] = - [\begin{matrix} m_{11} \nabla ℒ_{1} + \dots + m_{1 v} \nabla ℒ_{v} \\ ⋮ \\ m_{v 1} \nabla ℒ_{1} + \dots + m_{vv} \nabla ℒ_{v} \end{matrix}] & (3.12 - 9) \end{matrix}$
$\begin{matrix} \frac{dM (t)}{dt} = - {(\nabla^{2} ℒ (q (t)))}^{T} (\nabla^{2} ℒ (q (t))) M (t) + {(\nabla^{2} ℒ (q (t)))}^{T} = - [\begin{matrix} \nabla ℒ_{11} & \dots & \nabla ℒ_{v 1} \\ ⋮ & ⋱ & ⋮ \\ \nabla ℒ_{1 v} & \dots & \nabla ℒ_{vv} \end{matrix}] [\begin{matrix} \nabla ℒ_{11} & \dots & \nabla ℒ_{1 v} \\ ⋮ & ⋱ & ⋮ \\ \nabla ℒ_{v 1} & \dots & \nabla ℒ_{vv} \end{matrix}] [\begin{matrix} m_{11} & \dots & m_{1 v} \\ ⋮ & ⋱ & ⋮ \\ m_{v 1} & \dots & m_{vv} \end{matrix}] + [\begin{matrix} \nabla ℒ_{11} & \dots & \nabla ℒ_{v 1} \\ ⋮ & ⋱ & ⋮ \\ \nabla ℒ_{1 v} & \dots & \nabla ℒ_{vv} \end{matrix}] = [\begin{matrix} \nabla m_{11} & \dots & \nabla m_{1 v} \\ ⋮ & ⋱ & ⋮ \\ \nabla m_{v 1} & \dots & \nabla m_{vv} \end{matrix}] & (3.12 - 10) \end{matrix}$
where the (i, j) entry of the last matrix is $\nabla m_{ij} = - \sum_{k = 1}^{v} \sum_{l = 1}^{v} \nabla ℒ_{ki} \nabla ℒ_{kl} m_{lj} + \nabla ℒ_{ji}$ for i, j=1, . . . , v.
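The convergence of M(t) to A⁻¹ under dM/dt = −AᵀAM + Aᵀ can be checked numerically. The following is a minimal sketch under stated assumptions: the 2×2 matrix in the usage note, the nested-list helpers `matmul` and `transpose`, and the forward-Euler scheme are illustrative choices, not part of the source.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]

def inverse_by_flow(a, m0, step, n_steps):
    """Forward-Euler integration of dM/dt = -A^T A M + A^T from M(0) = m0.

    The step must be smaller than 2 / (largest eigenvalue of A^T A)
    for the explicit scheme to remain stable.
    """
    n = len(a)
    at = transpose(a)
    ata = matmul(at, a)
    m = [list(row) for row in m0]
    for _ in range(n_steps):
        ata_m = matmul(ata, m)
        m = [[m[i][j] + step * (at[i][j] - ata_m[i][j]) for j in range(n)]
             for i in range(n)]
    return m
```

With the hypothetical matrix `a = [[2.0, 1.0], [1.0, 3.0]]`, calling `inverse_by_flow(a, a, 0.05, 2000)` approaches the exact inverse `[[0.6, -0.2], [-0.2, 0.4]]`, because the only fixed point of the flow satisfies AᵀAM = Aᵀ.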
The approximation proceeds as follows, applying the Magnus expansion to compute the integral:
1. Fix M(0)=∇²ℒ(q(0)) and A=∇²ℒ(q(t)).
2. Use the variation of constants formula to solve
$M (T) = e^{- {[\nabla^{2} ℒ (q (T))]}^{2} T} M (0) + [\int_{0}^{T} e^{- {[\nabla^{2} ℒ (q (τ))]}^{2} (T - τ)} d τ] \nabla^{2} ℒ (q (T))$
The following is computation flow, flowing down unless otherwise indicated.
Process for Determining Dynamic Lagrangian Via Helmholtz Equations

Given

$\begin{matrix} G_{i} (\ddot{q}, \dot{q}, q) = \sum_{j = 1}^{n} W_{i, j} (\dot{q}, q) {\ddot{q}}^{(j)} + K_{i} (\dot{q}, q) = 0, i = 1, \dots, n & (3.12 - 11) \end{matrix}$
If the three conditions
$\begin{matrix} \frac{\partial G_{i}}{\partial {\ddot{q}}^{(j)}} = \frac{\partial G_{j}}{\partial {\ddot{q}}^{(i)}}, \\ \frac{\partial G_{i}}{\partial {\dot{q}}^{(j)}} + \frac{\partial G_{j}}{\partial {\dot{q}}^{(i)}} = \frac{d}{dt} (\frac{\partial G_{i}}{\partial {\ddot{q}}^{(j)}} + \frac{\partial G_{j}}{\partial {\ddot{q}}^{(i)}}), \\ \frac{\partial G_{i}}{\partial q^{(j)}} - \frac{\partial G_{j}}{\partial q^{(i)}} = \frac{1}{2} \frac{d}{dt} (\frac{\partial G_{i}}{\partial {\dot{q}}^{(j)}} - \frac{\partial G_{j}}{\partial {\dot{q}}^{(i)}}), & (3.12 - 12) \end{matrix}$
with i, j=1, . . . , n hold, then there is a Lagrangian L such that
$\begin{matrix} \sum_{j = 1}^{n} (\frac{\partial^{2} L}{\partial {\dot{q}}^{(i)} \partial {\dot{q}}^{(j)}} {\ddot{q}}^{(j)} + \frac{\partial^{2} L}{\partial q^{(j)} \partial {\dot{q}}^{(i)}} {\dot{q}}^{(j)}) - \frac{\partial L}{\partial q^{(i)}} = G_{i}, i = 1, \dots, n & (3.12 - 13) \end{matrix}$
This is a second order, linear hyperbolic differential equation on the Lagrangian L. It can be solved efficiently by the method of characteristics.
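As a hedged illustration of these conditions (the example system is an assumption, not taken from the source), consider a single coordinate, n=1, with constants m and k and
$G (\ddot{q}, \dot{q}, q) = m \ddot{q} + k q .$
The first and third conditions hold trivially for i=j=1. The second condition, read in its standard form with velocity derivatives on the left-hand side, gives
$2 \frac{\partial G}{\partial \dot{q}} = 0 = \frac{d}{dt} (2 \frac{\partial G}{\partial \ddot{q}}) = \frac{d}{dt} (2 m) .$
A Lagrangian satisfying (3.12-13) for this G is
$L = \frac{1}{2} m {\dot{q}}^{2} - \frac{1}{2} k q^{2}, \frac{\partial^{2} L}{\partial {\dot{q}}^{2}} \ddot{q} + \frac{\partial^{2} L}{\partial q \partial \dot{q}} \dot{q} - \frac{\partial L}{\partial q} = m \ddot{q} + 0 + k q = G .$
By contrast, adding a damping term c{dot over (q)} with c≠0 to G makes the left-hand side of the second condition equal to 2c while the right-hand side stays 0, so the damped G does not satisfy the conditions as written (that is, without an integrating factor).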

Let

$\begin{matrix} G (\ddot{q}, \dot{q}, q) = [\begin{matrix} q (t) \\ \dot{q} (t) \\ \dot{M} (t) \end{matrix}] = [\begin{matrix} q^{(1)} \\ ⋮ \\ q^{(v)} \\ - (m_{11} \nabla ℒ_{1} + \dots + m_{1 v} \nabla ℒ_{v}) \\ ⋮ \\ - (m_{v 1} \nabla ℒ_{1} + \dots + m_{vv} \nabla ℒ_{v}) \\ \nabla m_{11} \\ ⋮ \\ \nabla m_{v 1} \\ ⋮ \\ \nabla m_{1 v} \\ ⋮ \\ \nabla m_{vv} \end{matrix}] & (3.12 - 14) \end{matrix}$
Process for Determining Hessian Rank of Dynamic Lagrangian
Information for determining Hessian rank of dynamic Lagrangian is available elsewhere herein.
Converting the Lagrangian to the Hamiltonian Via the Legendre Transformation.
In our formulation the Lagrangian, L_k ^(T)(q, {dot over (q)}; ω), may be converted to the Hamiltonian using the Legendre transformation, so that
$\begin{matrix} H_{k}^{(T)} (q, p; ω) = \frac{\partial L_{k}^{(T)}}{\partial \dot{q}} \dot{q} - L_{k}^{(T)} (q, \dot{q}; ω) = p^{T} \dot{q} - L_{k}^{(T)} (q, \dot{q}; ω) & (3.12 - 15) \end{matrix}$
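The Legendre transformation in (3.12-15) can be sketched numerically for a simple case. This is an illustration under stated assumptions: the quadratic Lagrangian, the parameter names `mass` and `stiffness`, and the finite-difference momentum are hypothetical and stand in for the rule-derived Lagrangian of the source.

```python
def lagrangian(q, qdot, mass=2.0, stiffness=3.0):
    """Hypothetical Lagrangian L = (1/2) m qdot^2 - (1/2) k q^2."""
    return 0.5 * mass * qdot**2 - 0.5 * stiffness * q**2

def momentum(q, qdot, h=1e-6):
    """Conjugate momentum p = dL/dqdot by central finite difference."""
    return (lagrangian(q, qdot + h) - lagrangian(q, qdot - h)) / (2.0 * h)

def hamiltonian_from_velocity(q, qdot):
    """H = p * qdot - L evaluated at (q, qdot), following (3.12-15)."""
    return momentum(q, qdot) * qdot - lagrangian(q, qdot)
```

For this Lagrangian the result agrees with the closed form p²/(2m) + kq²/2, which is the usual check that the transformation has been applied correctly.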
Pareto Multi-Criteria Optimization Engine (PMOE)
Consider the problem of determining the relaxed Pareto optimal solution to a given system query at a given time step. There are N decision elements, k=1, . . . , N. A given decision element, DE_k, has the following associated parameters which are constituent to the ARB:

- A generalized set of coordinates relevant to DE_k, q.
- A generalized set of linearly independent momenta {p_a}, where the index a refers to the linearly independent momenta selected from the canonical set p.
- A set of control parameters ω for hard and soft rules of the system, where 0≦ω_i≦1.

The ARB has the following components which determine the constraints of DE_k:

- The Hamiltonian which identifies the fundamental dynamics of the system for the k'th decision element, denoted

H _k ^(o)(q,{p _a}). (3.13-1)

- The summation of the first class constraints of the system, which is

$\begin{matrix} \sum_{i} ω_{i} f_{i} (q^{(i)}, ω_{i}) & (3.13 - 2) \end{matrix}$

- The summation of the second class constraints of the system which is

$\begin{matrix} \sum_{i} g_{i} (q^{(i)}, ω_{i}) & (3.13 - 3) \end{matrix}$

- The Tellegen agent which is a function of the Hamiltonians of the absolute rules of the other N−1 decision elements in the system

H _k ^(A) =F _k ^(A)(H ₁ ^(T) , . . . ,H _k−1 ^(T) ,H _k+1 ^(T) , . . . ,H _N ^(T)) (3.13-4)

- The total Hamiltonian of the system is denoted H^(T).
- Approximations to the various Hamiltonians are denoted Ĥ_k ^(A), Ĥ^(T) and Ĥ_k ^(o) for the Tellegen, total, and DE-level Hamiltonians respectively.

System Initialization
Determining the relaxed Pareto optimal point of the system is a process which includes:

- Initialization of N decision elements.
- Synchronization through companionship of each of the N decision elements with its respective Tellegen agent.

System Operation
Illustration SS shows how decision elements interact with the network, receive queries, and return results. In this example, the distributed system effectively implements an abstract classifier that has no real implementation. The DE's receive sensor data from the network which includes new available information which may benefit classification. The user submits a query that is received by a DE which then returns a result.
Gauge Systems in a Hamiltonian Domain
The time integral of the Lagrangian L(q, {dot over (q)}) is the action S_L, defined as
$S_{L} = \int_{t_{1}}^{t_{2}} L (q, \dot{q}) dt$
where
$\dot{q} = \frac{dq (t)}{dt} .$
The Lagrangian conditions for stationarity are first that
$\begin{matrix} \frac{d}{dt} L_{{\dot{q}}^{(n)}} - L_{q^{(n)}} = 0 & (3.14 - 1) \end{matrix}$
where n=1, . . . , N,
$L_{{\dot{q}}^{(n)}} = \frac{\partial L}{\partial {\dot{q}}^{(n)}},$ and $L_{q^{(n)}} = \frac{\partial L}{\partial q^{(n)}} .$
Expanding the total time derivative, these equations can be written in more detail as
$\begin{matrix} \sum_{n^{'} = 1}^{N} {\ddot{q}}^{(n^{'})} L_{{\dot{q}}^{(n^{'})} {\dot{q}}^{(n)}} = L_{q^{(n)}} - \sum_{n^{'} = 1}^{N} {\dot{q}}^{(n^{'})} L_{q^{(n^{'})} {\dot{q}}^{(n)}} & (3.14 - 2) \end{matrix}$
where
${\ddot{q}}^{(n^{'})} = \frac{d^{2} q^{(n^{'})}}{{dt}^{2}},$ $L_{{\dot{q}}^{(n^{'})} {\dot{q}}^{(n)}} = \frac{\partial^{2} L}{\partial {\dot{q}}^{(n^{'})} \partial {\dot{q}}^{(n)}},$ and $L_{q^{(n^{'})} {\dot{q}}^{(n)}} = \frac{\partial^{2} L}{\partial q^{(n^{'})} \partial {\dot{q}}^{(n)}} .$
The generalized accelerations {umlaut over (q)}⁽ⁿ⁾ are uniquely determined by the positions and velocities if the matrix L_{{dot over (q)}{dot over (q)}} of second derivatives with respect to the velocities is invertible, or equivalently
det(L _{{dot over (q)}{dot over (q)}})≠0 (3.14-3)
If det(L_{{dot over (q)}{dot over (q)}})=0, the acceleration vector {umlaut over (q)} will not be uniquely determined.
The starting point for the Hamiltonian approach is the definition of the conjugate momentum
p _n =L _{{dot over (q)}} _(n) (3.14-4)
where n=1, . . . , N. We will see that the failure of (3.14-3), that is, non-invertibility of
$L_{\dot{q} \dot{q}} = [\begin{matrix} L_{{\dot{q}}^{(1)} {\dot{q}}^{(1)}} & \dots & L_{{\dot{q}}^{(1)} {\dot{q}}^{(N)}} \\ ⋮ & ⋱ & ⋮ \\ L_{{\dot{q}}^{(N)} {\dot{q}}^{(1)}} & \dots & L_{{\dot{q}}^{(N)} {\dot{q}}^{(N)}} \end{matrix}]$
is the condition for non-invertibility of the velocities as functions of the coordinates q and momenta p. In other words, in this case, the momenta defined in (3.14-4) are not all independent. Define the relations that follow from (3.14-4) as
φ_m(q,p)=0 (3.14-5)
where m=1, . . . , M. Write (3.14-4) in vector notation as
p=L _{{dot over (q)}}(q,{dot over (q)}).
Then compatibility demands that
φ_m(q,L _{{dot over (q)}}(q,{dot over (q)}))=0
be an identity for m=1, . . . , M.
Relations specified in (3.14-5) are called primary constraints. For simplicity let's assume that rank(L_{{dot over (q)}{dot over (q)}}) is constant throughout the phase space, (q, {dot over (q)}), so that (3.14-5) defines a submanifold smoothly embedded in the phase space. This manifold is known as the primary constraint surface. Let
rank(L _{{dot over (q)}{dot over (q)}})=N−M′ (3.14-6)
Then there are M′ independent constraints among (3.14-5) and the primary constraint surface is a phase space submanifold of dimension 2N−M′.
We do not assume that all the constraints are linearly independent so that
M′≦M. (3.14-7)
It follows from (3.14-5) that the inverse transformation from the p's to the {dot over (q)}'s is multivalued. That is, given q, p that satisfies (3.14-5), the inverse image (q, {dot over (q)}) that satisfies
$\begin{matrix} p = {(\frac{\partial L}{\partial \dot{q}})}^{T} & (3.14 - 8) \end{matrix}$
is not unique, since (3.14-8) defines a map from a 2N-dimensional manifold (q, {dot over (q)}) to the smaller (2N−M′)-dimensional manifold. Thus the inverse images of a given point of (3.14-5) form a manifold of dimension M′.
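As a hedged worked example (the Lagrangian below is an illustrative assumption, not taken from the source), take N=2 and
$L = \frac{1}{2} {({\dot{q}}^{(1)} - {\dot{q}}^{(2)})}^{2} .$
Then
$p_{1} = {\dot{q}}^{(1)} - {\dot{q}}^{(2)}, p_{2} = - ({\dot{q}}^{(1)} - {\dot{q}}^{(2)}), L_{\dot{q} \dot{q}} = [\begin{matrix} 1 & - 1 \\ - 1 & 1 \end{matrix}]$
so det(L_{{dot over (q)}{dot over (q)}})=0 and rank(L_{{dot over (q)}{dot over (q)}})=1=N−M′ with M′=1. The single primary constraint is φ₁(q, p)=p₁+p₂=0, and the primary constraint surface has dimension 2N−M′=3. Every pair of velocities with the same difference {dot over (q)}⁽¹⁾−{dot over (q)}⁽²⁾ maps to the same momenta, so the inverse image of a point on the constraint surface is a manifold of dimension M′=1.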
Conditions on the Constraint Function
There exist many equivalent ways to represent a given surface by means of equations of the form of (3.14-5). For example the surface p₁=0 can be represented equivalently by p₁ ²=0, √{square root over (|p₁|)}=0, or redundantly by p₁=0 and p₁ ²=0. To use the Hamiltonian formalism, it is necessary to impose some restrictions, which are the regularity conditions for the constraints.
Regularity Conditions
The (2N−M′)-dimensional constraint surface φ_m(q, p)=0 should be coverable by open regions; in each region the constraints can be split into independent constraints
{φ_m′ |m′=1, . . . ,M′}.
Their Jacobian matrix
${\frac{\partial φ_{m^{'}}}{\partial p_{n}, q^{(n)}}} = [\begin{matrix} \frac{\partial φ_{1}}{\partial p_{1}, q^{(1)}} & \dots & \frac{\partial φ_{1}}{\partial p_{n}, q^{(n)}} \\ ⋮ & ⋱ & ⋮ \\ \frac{\partial φ_{m^{'}}}{\partial p_{1}, q^{(1)}} & \dots & \frac{\partial φ_{m^{'}}}{\partial p_{n}, q^{(n)}} \end{matrix}]$
with m′=1, . . . , M′ and n=1, . . . , N, is of rank M′.
The dependent constraints φ_m=0, m=M′+1, . . . , M, hold as consequences of the others: φ_m′=0 implies φ_m=0. The condition on the Jacobian can alternatively be stated in any of the following ways.

- 1. The functions φ_m′ can be taken locally as the first M′ coordinates of a new regular coordinate system in the vicinity of the constraint surface, or the differentials dφ₁, . . . , dφ_M′ are locally linearly independent:

dφ₁∧ . . . ∧dφ_M′≠0 (3.14-9)

- 2. The variations δφ_m′ are of order ε for arbitrary variations δq⁽ⁱ⁾, δp_i of order ε (Dirac's approach).

Theorem 3.14.1. If a smooth phase space function G vanishes on the surface {φ_m=0}, then

$\begin{matrix} G = \sum_{m = 1}^{M} G^{(m)} φ_{m} & (3.14 - 10) \end{matrix}$

for some functions G^(m)(q, p).

Proof:

(local proof). Set φ_m′, m′=1, . . . , M′ as coordinates (y_m′,x_α) with y_m′=φ_m′. In these coordinates G(0,x)=0 and
$\begin{matrix} \begin{matrix} G (y, x) = \int_{0}^{1} \frac{d}{dt} G (ty, x) dt \\ = \sum_{m^{'} = 1}^{M^{'}} y_{m^{'}} \int_{0}^{1} \frac{\partial}{\partial y_{m^{'}}} G (ty, x) dt \\ = \sum_{m^{'} = 1}^{M^{'}} g^{(m^{'})} (y, x) φ_{m^{'}} (y, x) \end{matrix} & (3.14 - 11) \\ with \\ g^{(m^{'})} (y, x) = \int_{0}^{1} \frac{\partial}{\partial y_{m^{'}}} G (ty, x) dt . \end{matrix}$

Theorem 3.14.2.

If the sum Σ(λ⁽ⁿ⁾δq⁽ⁿ⁾+μ_nδp_n)=0 for arbitrary variations δq⁽ⁿ⁾, δp_n tangent to the constraint surface {φ_m(q, p)=0|m=1, . . . , M}, then for some u^(m)
$\begin{matrix} λ^{(n)} = \sum_{m = 1}^{M} u^{(m)} \frac{\partial φ_{m}}{\partial q^{(n)}} & (3.14 - 12) \\ μ_{n} = \sum_{m = 1}^{M} u^{(m)} \frac{\partial φ_{m}}{\partial p_{n}} & (3.14 - 13) \end{matrix}$

Proof.

The dimension of the constraint surface {φ_m=0} is 2N−M′. Thus the tangent variations at a point (p, q) form a (2N−M′)-dimensional space, and each satisfies
$\begin{matrix} \sum_{n = 1}^{N} (λ^{(n)} δ q^{(n)} + μ_{n} δ p_{n}) = 0 & (3.14 - 14) \end{matrix}$
By the regularity assumption, there exist exactly M′ independent solutions (λ, μ) to (3.14-14). Clearly, the gradients
${\frac{\partial Φ_{m^{'}}}{\partial q^{(n)}}}$ and ${\frac{\partial Φ_{m^{'}}}{\partial p_{n}}}$
are linearly independent. They form a basis for the solutions to (3.14-14).
Note that in the presence of redundant constraints, the functions u^(m) exist but are not unique.
Canonical Hamiltonian
The Hamiltonian in canonical coordinates is
$\begin{matrix} H (q, p) = \sum_{n = 1}^{N} {\dot{q}}^{(n)} p_{n} - L (q, \dot{q}) & (3.14 - 15) \end{matrix}$
The velocities {dot over (q)} enter H only through the combination given by the conjugate momenta defined for each coordinate
p _n(q,{dot over (q)})=L _{{dot over (q)}} _(n)(q,{dot over (q)}) (3.14-16)
This remarkable property is essential for the Hamiltonian approach. It is verified by evaluating the change δH induced by arbitrary independent variations of positions and velocities.
$\begin{matrix} \begin{matrix} δ H = \sum_{n = 1}^{N} ({\dot{q}}^{(n)} δ p_{n} + δ {\dot{q}}^{(n)} p_{n}) - δ L \\ = \sum_{n = 1}^{N} ({\dot{q}}^{(n)} δ p_{n} + δ {\dot{q}}^{(n)} p_{n}) - \sum_{n = 1}^{N} (L_{q^{(n)}} δ q^{(n)} + L_{{\dot{q}}^{(n)}} δ {\dot{q}}^{(n)}) \end{matrix} & (3.14 - 17) \end{matrix}$
Utilizing (3.14-16) in (3.14-17) yields
$\begin{matrix} δ H = \sum_{n = 1}^{N} ({\dot{q}}^{(n)} δ p_{n} - L_{q^{(n)}} δ q^{(n)}) & (3.14 - 18) \end{matrix}$
The Hamiltonian defined by (3.14-15) is not unique as a function of p, q. This can be inferred from (3.14-18) by noticing that {δp_n|n=1, . . . , N} are not all independent. They are restricted to preserve the primary constraints φ_m≈0, which are identities when the p's are expressed as functions of the q's and {dot over (q)}'s via (3.14-16).
Using the definition of the differential in several variables applied to δH=δH({q⁽ⁿ⁾}, {p_n}), (3.14-18) can be rewritten
$\begin{matrix} \sum_{n = 1}^{N} (\frac{\partial H}{\partial q^{(n)}} δ q^{(n)} + \frac{\partial H}{\partial p_{n}} δ p_{n}) = \sum_{n = 1}^{N} ({\dot{q}}^{(n)} δ p_{n} - δ q^{(n)} \frac{\partial L}{\partial q^{(n)}}) \\ \text{or} \\ \sum_{n = 1}^{N} (\frac{\partial H}{\partial q^{(n)}} + \frac{\partial L}{\partial q^{(n)}}) δ q^{(n)} + \sum_{n = 1}^{N} (\frac{\partial H}{\partial p_{n}} - {\dot{q}}^{(n)}) δ p_{n} = 0 & (3.14 - 19) \end{matrix}$
From Theorem 3.14.2, taking the multipliers there to be −u^(m), we then conclude for each n that
$\begin{matrix} \frac{\partial H}{\partial q^{(n)}} + \frac{\partial L}{\partial q^{(n)}} = - \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial q^{(n)}} \text{ and } \frac{\partial H}{\partial p_{n}} - {\dot{q}}^{(n)} = - \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial p_{n}} . & (3.14 - 20) \end{matrix}$
So for each n:
$\begin{matrix} {\dot{q}}^{(n)} = \frac{\partial H}{\partial p_{n}} + \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial p_{n}}, n = 1, \dots, N & (3.14 - 21) \\ and \\ - \frac{\partial L}{\partial q^{(n)}} = \frac{\partial H}{\partial q^{(n)}} + \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial q^{(n)}}, n = 1, \dots, N . & (3.14 - 22) \end{matrix}$
Note that if the constraints are independent, the vectors
${\frac{\partial Φ_{m}}{\partial p_{n}}}, n = 1, \dots, N,$
m=1, . . . , M are also independent because of the regularity conditions (this is proved later). Hence no two sets of {u^(m)|m=1, . . . , M} can yield the same velocities via (3.14-21).
Thus, using
${\dot{q}}^{(n)} = \frac{\partial H}{\partial p_{n}} + \sum_{m = 1}^{M} u^{(m)} (q, \dot{q}) \frac{\partial Φ_{m}}{\partial p_{n}} (q, p (q, \dot{q}))$
we can find u^(m)(q, {dot over (q)}). Define the transformation from (q, {dot over (q)}) to the manifold {φ_m(q, p)=0|m=1, . . . , M}, taking q, {dot over (q)}, u→q, p, u, by
q⁽ⁿ⁾=q⁽ⁿ⁾,n=1, . . . ,N
p _n =L _{{dot over (q)}} _(n)(q,{dot over (q)}),n=1, . . . ,N−M′
u ^(m) =u ^(m)(q,{dot over (q)}),m=1, . . . ,M′
We see that this transformation is invertible, since the inverse map q, p, u→q, {dot over (q)}, u is given by
$q = q$ ${\dot{q}}^{(n)} = \frac{\partial H}{\partial p_{n}} + \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial p_{n}}$ $φ_{m} (q, p) = 0$
Thus invertibility of the Legendre transformation when
det(L _{{dot over (q)}{dot over (q)}})=0
can be regained at the price of adding extra variables.
Action Principle of the Hamiltonian Form
With (3.14-21) and (3.14-22) we can rewrite (3.14-1) in the equivalent Hamiltonian form
$\begin{matrix} {\dot{q}}^{(n)} = \frac{\partial H}{\partial p_{n}} + \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial p_{n}}, \\ {\dot{p}}_{n} = - \frac{\partial H}{\partial q^{(n)}} - \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial q^{(n)}}, \\ Φ_{m} (q, p) = 0, m = 1, \dots, M^{'} & (3.14 - 23) \end{matrix}$
The Hamiltonian Equations (3.14-23) can be derived from the following variational principle:
$δ \int_{t_{1}}^{t_{2}} [\sum_{n = 1}^{N} {\dot{q}}^{(n)} p_{n} - H - \sum_{m = 1}^{M} u^{(m)} φ_{m}] dt = 0$
for arbitrary variations δq⁽ⁿ⁾, δp_n, and δu^(m) subject to
δq(t ₁)=δq(t ₂)=0
where the u^(m) appear now as Lagrange multipliers enforcing the primary constraints
φ_m(q,p)=0,m=1, . . . ,M.
Let F(p, q) be an arbitrary function of the canonical variables, then
$\begin{matrix} \begin{matrix} \frac{dF}{dt} = \sum_{n = 1}^{N} \frac{\partial F}{\partial q^{(n)}} {\dot{q}}^{(n)} + \sum_{n = 1}^{N} \frac{\partial F}{\partial p_{n}} {\dot{p}}_{n} \\ = \sum_{n = 1}^{N} \frac{\partial F}{\partial q^{(n)}} [\frac{\partial H}{\partial p_{n}} + \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial p_{n}}] + \\ \sum_{n = 1}^{N} \frac{\partial F}{\partial p_{n}} [- \frac{\partial H}{\partial q^{(n)}} - \sum_{m = 1}^{M} u^{(m)} \frac{\partial Φ_{m}}{\partial q^{(n)}}] \\ = [F, H] + \sum_{m = 1}^{M} u^{(m)} [F, φ_{m}] \end{matrix} & (3.14 - 25) \end{matrix}$
The equation (3.14-25) introduces the new binary operator [•,•] which is the Poisson bracket and has the form
$\begin{matrix} \begin{matrix} [F, G] = \sum_{n = 1}^{N} [\frac{\partial F}{\partial q^{(n)}} \frac{\partial G}{\partial p_{n}} - \frac{\partial F}{\partial p_{n}} \frac{\partial G}{\partial q^{(n)}}] \\ = \sum_{n = 1}^{N} [F_{q^{(n)}} G_{p_{n}} - F_{p_{n}} G_{q^{(n)}}] \end{matrix} & (3.14 - 26) \end{matrix}$
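The Poisson bracket can be evaluated numerically as a quick sanity check. The sketch below assumes the standard antisymmetric form, with a minus sign between the two products; the finite-difference helper `partial`, the phase-space layout `[q1..qn, p1..pn]`, and the harmonic-oscillator Hamiltonian are hypothetical choices for illustration only.

```python
def partial(f, point, index, h=1e-5):
    """Central-difference partial derivative of f with respect to point[index]."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(up) - f(down)) / (2.0 * h)

def poisson_bracket(f, g, point, n):
    """[F, G] = sum over n of (F_q G_p - F_p G_q); point = [q1..qn, p1..pn]."""
    total = 0.0
    for i in range(n):
        total += partial(f, point, i) * partial(g, point, n + i)
        total -= partial(f, point, n + i) * partial(g, point, i)
    return total

# Hypothetical one-degree-of-freedom example: unit harmonic oscillator.
def coordinate(z):
    return z[0]

def momentum(z):
    return z[1]

def hamiltonian(z):
    return 0.5 * z[1]**2 + 0.5 * z[0]**2
```

At the point `[0.3, 0.7]`, the bracket of `coordinate` with `momentum` evaluates to approximately 1, and the brackets of `coordinate` and `momentum` with `hamiltonian` reproduce Hamilton's equations for the unconstrained case (all u^(m)=0): the first gives the momentum and the second gives minus the coordinate.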
Secondary Constraints
The basic consistency condition is that the primary constraints be preserved in time. So for
F(p,q)=φ_m(q,p)
we should have that {dot over (φ)}_m=0 on the constraint surface {φ_m(q, p)=0}. This means
$\begin{matrix} [φ_{m}, H] + \sum_{m^{'} = 1}^{M^{'}} u^{(m^{'})} [φ_{m}, φ_{m^{'}}] = 0 & (3.14 - 27) \end{matrix}$
This equation can either reduce to a relation independent of the u^(m′), or it may impose a restriction on the u's. In the latter case, when the matrix of brackets {[φ_m, φ_m′]} is invertible, the restriction is
u=−{[φ _m,φ_m′]}⁻¹[φ_m ,H](q,p) (3.14-28)
In the case where (3.14-27) is independent of the u's, (3.14-27) is called a secondary constraint. The fundamental difference between secondary and primary constraints is that primary constraints are the consequence of the definition (3.14-8), while secondary constraints depend on the dynamics.
If X(q, p)=0 is an external constraint, we must impose a compatibility condition
$\begin{matrix} [X, H] + \sum_{m = 1}^{M^{'}} u^{(m)} [X, φ_{m}] = 0 & (3.14 - 29) \end{matrix}$
Next we need to test whether this constraint:
$\begin{matrix} Φ (p, q) = [X, H] + \sum_{m = 1}^{M^{'}} u^{(m)} [X, φ_{m}] = 0 & \begin{matrix} (3.14 - 30) \\ (3.14 - 31) \end{matrix} \end{matrix}$
implies new secondary constraints or whether it only restricts the u's. After the process is finished we are left with a number of secondary constraints which will be denoted by
φ_k=0,k=M+1, . . . ,M+K
where K is the total number of secondary constraints. In general, it will be useful to denote all the constraints (primary and secondary) in a uniform way as
φ_j(q,p)=0,j=1, . . . ,M+K=J (3.14-32)
We make the same regularity assumptions on the full set of constraints.
Weak and Strong Equations
Equation (3.14-32) can be written as
φ_j(•)≈0 (3.14-33)
to emphasize that the quantity φ_j is numerically restricted to be zero but does not vanish identically throughout phase space. This means, in particular, that φ_j can have non-zero Poisson brackets with the canonical variables.
Two functions F, G that coincide on the manifold {φ_j≈0|j=1, . . . , J} are said to be weakly equal, denoted F≈G. On the other hand, an equation that holds throughout the entire phase space and not just on the submanifold {φ_j≈0} is called strong. Hence, by Theorem 3.14.1,
$\begin{matrix} F \approx G \Leftrightarrow F - G = \sum_{j = 1}^{J} c^{(j)} (p, q) φ_{j} & (3.14 - 34) \end{matrix}$
Restrictions on the Lagrange Multipliers
Assume that we have found a complete set of constraints
{φ_j≈0|j=1, . . . ,J} (3.14-35)
The multipliers u^(m) must then satisfy the consistency conditions
$\begin{matrix} [φ_{j}, H] + \sum_{m = 1}^{M} u^{(m)} [φ_{j}, φ_{m}] \approx 0 & (3.14 - 36) \end{matrix}$
We consider (3.14-36) as a set of J non-homogeneous linear equations in the M≦J unknowns u^(m), with coefficients that are functions of the q's and p's.
The general solution of (3.14-36) is of the form
u ^(m) =U ^(m) +V ^(m) ,m=1, . . . ,M (3.14-37)
with U^(m) a particular solution of (3.14-36) and V^(m) the general solution of the associated homogeneous equation
$\begin{matrix} \sum_{m = 1}^{M} V^{(m)} [φ_{j}, φ_{m}] \approx 0 & (3.14 - 38) \end{matrix}$
The most general solution of (3.14-38) is a linear combination of linearly independent solutions V_α ^(m), where α=1, . . . , A with A≦M. Under the assumption that the matrix
$\begin{matrix} [\begin{matrix} [φ_{1}, φ_{1}] & \dots & [φ_{1}, φ_{M}] \\ ⋮ & ⋱ & ⋮ \\ [φ_{J}, φ_{1}] & \dots & [φ_{J}, φ_{M}] \end{matrix}] & (3.14 - 39) \end{matrix}$
is of constant rank, the number of independent solutions A is the same for all p, q. Thus the general solution to (3.14-36) can be written as
$\begin{matrix} u^{(m)} \approx U^{(m)} + \sum_{α = 1}^{A} v^{(α)} V_{α}^{(m)}, m = 1, \dots, M & (3.14 - 40) \end{matrix}$
where the coefficients v^(α) are arbitrary.
Irreducible and Reducible Cases
If the equations {φ_j=0|j=1, . . . , J} are not independent, one says that the constraints are reducible. The system is irreducible when the constraints are independent. However the separation of constraints into dependent and independent ones might be difficult to perform. It also may disturb invariance properties under some important symmetry. In some cases it may be impossible to separate reducible from irreducible contexts. Reducible cases arise for example when the dynamical coordinates include p-form gauge fields.
Any irreducible set of constraints can always be replaced by a reducible set by introducing constraints that are consequences of the ones already at hand. The formalism should be invariant under such replacements.
Total Hamiltonian
We now discuss details of the dynamic equation (3.14-25)
$\begin{matrix} \dot{F} \approx [F, H^{'} + \sum_{α = 1}^{A} v^{(α)} φ_{α}] & (3.14 - 41) \end{matrix}$
where from (3.14-40)
$\begin{matrix} H^{'} = H + \sum_{m = 1}^{M} U^{(m)} φ_{m} and φ_{α} = \sum_{m = 1}^{M} V_{α}^{(m)} φ_{m}, α = 1, \dots, A & (3.14 - 42) \end{matrix}$
This is the result of theorem 3 (see below).
Theorem 3.
$\begin{matrix} [F, \sum_{m = 1}^{M} U^{(m)} φ_{m}] \approx \sum_{m = 1}^{M} U^{(m)} [F, φ_{m}] & (3.14 - 43) \\ [F, \sum_{m = 1}^{M} V_{α}^{(m)} φ_{m}] \approx \sum_{m = 1}^{M} V_{α}^{(m)} [F, φ_{m}] & (3.14 - 44) \end{matrix}$

Proof.

$\begin{matrix} [F, \sum_{m = 1}^{M} U^{(m)} φ_{m}] = \sum_{i = 1}^{N} \{ \frac{\partial F}{\partial q^{(i)}} \frac{\partial}{\partial p_{i}} \sum_{m = 1}^{M} U^{(m)} φ_{m} - \frac{\partial F}{\partial p_{i}} \frac{\partial}{\partial q^{(i)}} \sum_{m = 1}^{M} U^{(m)} φ_{m} \} \\ = \sum_{i = 1}^{N} \frac{\partial F}{\partial q^{(i)}} [\sum_{m = 1}^{M} \frac{\partial U^{(m)}}{\partial p_{i}} φ_{m} + \sum_{m = 1}^{M} U^{(m)} \frac{\partial φ_{m}}{\partial p_{i}}] - \sum_{i = 1}^{N} \frac{\partial F}{\partial p_{i}} [\sum_{m = 1}^{M} \frac{\partial U^{(m)}}{\partial q^{(i)}} φ_{m} + \sum_{m = 1}^{M} U^{(m)} \frac{\partial φ_{m}}{\partial q^{(i)}}] \\ = \sum_{m = 1}^{M} \{ [F, U^{(m)}] φ_{m} + U^{(m)} [F, φ_{m}] \} \\ \text{so} \\ [F, \sum_{m = 1}^{M} U^{(m)} φ_{m}] - \sum_{m = 1}^{M} U^{(m)} [F, φ_{m}] = \sum_{m = 1}^{M} [F, U^{(m)}] φ_{m} & (3.14 - 45) \end{matrix}$
and from (3.14-34) in (3.14-45), (3.14-43) follows. By a similar process we show (3.14-44). We now prove the validity of (3.14-41).
Theorem 4.
Let F(q, p) be a regular function, then F(q, p) propagates in time according to the weak equation (3.14-41).

Proof.

From (3.14-25),

$\begin{matrix} \frac{d F}{d t} = [F, H] + \sum_{m = 1}^{M} u^{(m)} [F, φ_{m}] . & (3.14 - 46) \end{matrix}$
From (3.14-40) into (3.14-46) we obtain,
$\begin{matrix} \frac{d F}{dt} \approx [F, H] + \sum_{m = 1}^{M} (U^{(m)} + \sum_{α = 1}^{A} v^{(α)} V_{α}^{(m)}) [F, φ_{m}] \\ \text{or} \\ \frac{d F}{dt} \approx [F, H] + \sum_{m = 1}^{M} U^{(m)} [F, φ_{m}] + \sum_{m = 1}^{M} \sum_{α = 1}^{A} v^{(α)} V_{α}^{(m)} [F, φ_{m}] & (3.14 - 47) \end{matrix}$
Thus from (3.14-43) and (3.14-44) of theorem 3, we get
$\begin{matrix} \frac{d F}{d t} \approx [F, H] + \sum_{m = 1}^{M} [F, U^{(m)} φ_{m}] + \sum_{α = 1}^{A} v^{(α)} [F, \sum_{m = 1}^{M} V_{α}^{(m)} φ_{m}] \\ \approx [F, H + \sum_{m = 1}^{M} U^{(m)} φ_{m} + \sum_{α = 1}^{A} v^{(α)} \sum_{m = 1}^{M} V_{α}^{(m)} φ_{m}] \\ \approx [F, H^{'} + \sum_{α = 1}^{A} v^{(α)} φ_{α}] & (3.14 - 48) \\ \text{where} \\ H^{'} = H + \sum_{m = 1}^{M} U^{(m)} φ_{m} & (3.14 - 49) \\ φ_{α} = \sum_{m = 1}^{M} V_{α}^{(m)} φ_{m} & (3.14 - 50) \end{matrix}$
Now define
$\begin{matrix} H_{T} = H^{'} + \sum_{α = 1}^{A} v^{(α)} φ_{α} . & (3.14 - 51) \end{matrix}$
So we obtain
$\begin{matrix} \frac{dF}{dt} \approx [F, H_{T}] & (3.14 - 52) \end{matrix}$
First and Second Class Functions
The distinction between primary and secondary constraints is of little importance. We now consider a fundamental classification. It depends on the concept of first class and second class functions.
Definition 1.
A function F(q, p) is said to be first class if its Poisson bracket with every constraint vanishes weakly, [F, φ_j]≈0, j=1, . . . , J. A function of the canonical variables that is not first class is called second class. Thus F is second class if [F, φ_k] fails to vanish weakly for at least one k, k=1, . . . , J.
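As a hedged illustration of this definition (the constraints below are assumptions chosen for simplicity, not taken from the source), let N=2 with the two constraints φ₁=q⁽¹⁾ and φ₂=p₁. Using the antisymmetric Poisson bracket,
$[φ_{1}, φ_{2}] = [q^{(1)}, p_{1}] = 1,$
which does not vanish on the constraint surface, so φ₁ and φ₂ are both second class. The function F=p₂, on the other hand, satisfies
$[F, φ_{1}] = [p_{2}, q^{(1)}] = 0, [F, φ_{2}] = [p_{2}, p_{1}] = 0,$
and is therefore first class.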
Theorem 5.
If F and G are first class functions, then their Poisson bracket is also a first class function.
Proof:
By Hypothesis,
$\begin{matrix} [F, φ_{j}] = \sum_{k = 1}^{M} f_{j}^{(k)} φ_{k} & (3.14 - 53) \\ [G, φ_{j}] = \sum_{l = 1}^{M} g_{j}^{(l)} φ_{l} & (3.14 - 54) \end{matrix}$
Applying the Jacobi identity, we get
$\begin{matrix} [[F, G], φ_{j}] = [F, [G, φ_{j}]] - [G, [F, φ_{j}]] \\ = [F, \sum_{l = 1}^{M} g_{j}^{(l)} φ_{l}] - [G, \sum_{k = 1}^{M} f_{j}^{(k)} φ_{k}] \\ = \sum_{i} \{ \frac{\partial F}{\partial q^{(i)}} \frac{\partial}{\partial p_{i}} \sum_{l = 1}^{M} g_{j}^{(l)} φ_{l} - \frac{\partial F}{\partial p_{i}} \frac{\partial}{\partial q^{(i)}} \sum_{l = 1}^{M} g_{j}^{(l)} φ_{l} \} - \sum_{n} \{ \frac{\partial G}{\partial q^{(n)}} \frac{\partial}{\partial p_{n}} \sum_{k = 1}^{M} f_{j}^{(k)} φ_{k} - \frac{\partial G}{\partial p_{n}} \frac{\partial}{\partial q^{(n)}} \sum_{k = 1}^{M} f_{j}^{(k)} φ_{k} \} \\ = \sum_{i} \{ \frac{\partial F}{\partial q^{(i)}} \sum_{l = 1}^{M} (\frac{\partial g_{j}^{(l)}}{\partial p_{i}} φ_{l} + g_{j}^{(l)} \frac{\partial φ_{l}}{\partial p_{i}}) - \frac{\partial F}{\partial p_{i}} \sum_{l = 1}^{M} (\frac{\partial g_{j}^{(l)}}{\partial q^{(i)}} φ_{l} + g_{j}^{(l)} \frac{\partial φ_{l}}{\partial q^{(i)}}) \} - \sum_{n} \{ \frac{\partial G}{\partial q^{(n)}} \sum_{k = 1}^{M} (\frac{\partial f_{j}^{(k)}}{\partial p_{n}} φ_{k} + f_{j}^{(k)} \frac{\partial φ_{k}}{\partial p_{n}}) - \frac{\partial G}{\partial p_{n}} \sum_{k = 1}^{M} (\frac{\partial f_{j}^{(k)}}{\partial q^{(n)}} φ_{k} + f_{j}^{(k)} \frac{\partial φ_{k}}{\partial q^{(n)}}) \} \\ = \sum_{l = 1}^{M} \{ φ_{l} \sum_{i} (\frac{\partial F}{\partial q^{(i)}} \frac{\partial g_{j}^{(l)}}{\partial p_{i}} - \frac{\partial F}{\partial p_{i}} \frac{\partial g_{j}^{(l)}}{\partial q^{(i)}}) + g_{j}^{(l)} \sum_{i} (\frac{\partial F}{\partial q^{(i)}} \frac{\partial φ_{l}}{\partial p_{i}} - \frac{\partial F}{\partial p_{i}} \frac{\partial φ_{l}}{\partial q^{(i)}}) \} - \sum_{k = 1}^{M} \{ φ_{k} \sum_{n} (\frac{\partial G}{\partial q^{(n)}} \frac{\partial f_{j}^{(k)}}{\partial p_{n}} - \frac{\partial G}{\partial p_{n}} \frac{\partial f_{j}^{(k)}}{\partial q^{(n)}}) + f_{j}^{(k)} \sum_{n} (\frac{\partial G}{\partial q^{(n)}} \frac{\partial φ_{k}}{\partial p_{n}} - \frac{\partial G}{\partial p_{n}} \frac{\partial φ_{k}}{\partial q^{(n)}}) \} \\ = \sum_{l = 1}^{M} \{ φ_{l} [F, g_{j}^{(l)}] + g_{j}^{(l)} [F, φ_{l}] \} - \sum_{k = 1}^{M} \{ φ_{k} [G, f_{j}^{(k)}] + f_{j}^{(k)} [G, φ_{k}] \} \\ = \sum_{l = 1}^{M} [F, g_{j}^{(l)}] φ_{l} - \sum_{k = 1}^{M} [G, f_{j}^{(k)}] φ_{k} + \sum_{l^{'} = 1}^{M} \{ \sum_{l = 1}^{M} g_{j}^{(l)} f_{l}^{(l^{'})} \} φ_{l^{'}} - \sum_{k^{'} = 1}^{M} \{ \sum_{k = 1}^{M} f_{j}^{(k)} g_{k}^{(k^{'})} \} φ_{k^{'}} \approx 0 \end{matrix}$
We now use theorem 5 to show the following.
Theorem 6.
H′ defined by (3.14-49) and φ_α defined by (3.14-50) are first class functions.
Proof:
This follows directly from (3.14-36) and (3.14-38).
We learn from theorem 6 that the total Hamiltonian defined by (3.14-51) is the sum of the first class Hamiltonian H′ and the first class primary constraints φ_α multiplied by arbitrary coefficients.
First Class Constraints as Generators of Gauge Transformations
Gauge transformations are transformations that do not change the physical state.
The presence of arbitrary functions of time v^(α), α=1, . . . , A in the total Hamiltonian H_T (see (3.14-51)) implies that not all the q's and p's are observable. Given a set of q's and p's, the state of the physical system is uniquely determined; however, the converse is not true: there is more than one set of values of the canonical variables that defines a given state. To illustrate this, note that if we give an initial set of canonical variables, and hence the physical state, at time t₁, we expect the equations of motion to fully determine the state at other times. Thus any ambiguity in the value of the canonical variables at t₂≠t₁ should be irrelevant from the physical point of view.

A Derivation Example

We propose here an alternate formulation of Dirac's formalism.
Primary Constraints
Recall that the momenta canonically conjugate to the generalized “coordinates” q^(j), j=1, . . . , N are given by
$\begin{matrix} p_{j} = \frac{\partial L (q, \dot{q})}{\partial {\dot{q}}^{(j)}}, j = 1, \dots, N . & (E - 1) \end{matrix}$
For non-singular systems these equations allow us to express the velocities q̇^(j), j=1, . . . , N in terms of the canonical variables,
$\begin{matrix} {\dot{q}}^{(i)} = f_{i} (q, p), i = 1, \dots, N & (E - 2) \end{matrix}$
By performing a Legendre transformation
$H_{c} (p, q) = \sum_{i = 1}^{N} p_{i} f_{i} (q, p) - L (q, f (q, p))$
we obtain the Hamiltonian H_c of the system, and from this function we obtain the standard equations of motion of the system.
$\begin{matrix} \begin{matrix} \dot{q} = \frac{\partial H_{c}}{\partial p} \\ \dot{p} = - \frac{\partial H_{c}}{\partial q} \end{matrix} & (E - 3) \end{matrix}$
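As a quick check of the sign convention in the Legendre transformation, consider a one-dimensional oscillator with unit mass and unit frequency. This example is an illustrative assumption and is not taken from the source.
$\begin{matrix} L = \frac{1}{2} {\dot{q}}^{2} - \frac{1}{2} q^{2}, p = \frac{\partial L}{\partial \dot{q}} = \dot{q}, \dot{q} = f (q, p) = p \\ H_{c} = p f (q, p) - L (q, f (q, p)) = p^{2} - (\frac{1}{2} p^{2} - \frac{1}{2} q^{2}) = \frac{1}{2} p^{2} + \frac{1}{2} q^{2} \\ \dot{q} = \frac{\partial H_{c}}{\partial p} = p, \dot{p} = - \frac{\partial H_{c}}{\partial q} = - q \end{matrix}$
Combining the last two relations gives $\ddot{q} = - q$, which is the Euler-Lagrange equation of the same Lagrangian, as expected for a non-singular system.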
For (E-2) to be well-defined, the Hessian W of the Lagrangian with respect to the velocities, W_ij=∂²L/∂q̇^(i)∂q̇^(j), must satisfy
$\begin{matrix} \det W \neq 0 & (E - 4) \end{matrix}$
In this case the accelerations q̈^(i) are uniquely determined by the q^(j) and q̇^(j).
When det W=0, the Hamiltonian equations of motion do not take the standard form, and we speak of a singular Lagrangian. For illustration purposes, consider a Lagrangian of the form
$\begin{matrix} L (q, \dot{q}) = \frac{1}{2} \sum_{i = 1}^{N} \sum_{j = 1}^{N} W_{ij} (q) {\dot{q}}^{(i)} {\dot{q}}^{(j)} + \sum_{i = 1}^{N} η_{i} (q) {\dot{q}}^{(i)} - V (q) & (E - 5) \end{matrix}$
with W a symmetric matrix. From (E-1), the canonical momentum for (E-5) is given by
$\begin{matrix} p_{i} = \sum_{j = 1}^{N} W_{ij} (q) {\dot{q}}^{(j)} + η_{i} (q), i = 1, \dots, N . & (E - 6) \end{matrix}$
If W is singular of rank R_W, then it possesses N−R_W eigenvectors with zero eigenvalue. For these eigenvectors v^(α),
$\sum_{j = 1}^{N} W_{ij} (q) v_{j}^{(α)} (q) = 0, α = 1, \dots, N - R_{W}$
So multiplying (E-6) by v_i^(α) and summing over i we get
$\begin{matrix} \begin{matrix} \sum_{i = 1}^{N} v_{i}^{(α)} (q) p_{i} = \sum_{i = 1}^{N} [\sum_{j = 1}^{N} (v_{i}^{(α)} (q) W_{ij} (q) {\dot{q}}^{(j)}) + v_{i}^{(α)} (q) η_{i} (q)] \\ = \sum_{i = 1}^{N} v_{i}^{(α)} (q) η_{i} (q), α = 1, \dots, N - R_{W} \end{matrix} & (E - 7) \end{matrix}$
So
$\sum_{i = 1}^{N} v_{i}^{(α)} (q) (p_{i} - η_{i} (q)) = 0, α = 1, \dots, N - R_{W} .$
Let {p_α}, α=1, . . . , N−R_W, denote the linearly dependent elements of p. Let {p_a}, a=1, . . . , R_W be the momenta satisfying (E-1). Then the constraint equations are of the form
$\begin{matrix} \sum_{β = 1}^{N - R_{W}} M_{α β} (q) p_{β} - F_{α} (q, \{p_{a}\}) = 0, α = 1, \dots, N - R_{W} & (E - 8) \end{matrix}$
and
$M_{α β} (q) = v_{β}^{(α)} (q)$
$\begin{matrix} F_{α} (q, \{p_{a}\}) = \sum_{i = 1}^{N} v_{i}^{(α)} (q) η_{i} (q) - \sum_{b = 1}^{R_{W}} v_{b}^{(α)} (q) p_{b} & (E - 9) \end{matrix}$
The matrix {M_αβ} is necessarily invertible because otherwise M would possess eigenvectors with zero eigenvalues, implying existence of additional constraints. Note that (E-8) can be written as
$p_{α} - g_{α} (q, \{p_{a}\}) = 0, α = 1, \dots, N - R_{W}$
with
$\begin{matrix} g_{α} (q, {p_{a}}) = \sum_{β = 1}^{N - R_{w}} M_{α β}^{- 1} F_{β} (q, {p_{a}}) & (E - 10) \end{matrix}$
with dim{p_a}=R_W. So we can define,
$\begin{matrix} φ_{α} (q, p) = p_{α} - g_{α} (q, \{p_{a}\}) = 0, α = 1, \dots, N - R_{W} & (E - 11) \end{matrix}$
In Dirac's terminology, constraints of the form of (E-11) are referred to as primary constraints. Although the derivation above is based on a Lagrangian that is quadratic in the velocity terms, it is generally valid for Lagrangians which depend on q and q̇ but not on higher derivatives.
Note: Primary constraints follow exclusively from the definition of canonical momenta.
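The construction of (E-7) can be sketched in code. The example below is an illustration, not part of the source: it takes the W and η of the Lagrangian (3.17-17) that appears later in this document, finds the zero eigenvectors of W with exact rational arithmetic, and evaluates the resulting primary constraints. The helper names `null_space`, `eta`, `momenta` and `primary_constraints` are hypothetical.

```python
from fractions import Fraction


def null_space(matrix):
    """Return a basis of {v : matrix @ v = 0} via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column belonging to each reduced row
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            v[p] = -rows[row_idx][free]
        basis.append(v)
    return basis


# W and eta for L = 1/2 qdot1^2 + qdot1*q2 + 1/2 (q1 - q2)^2 (assumed example).
W = [[1, 0], [0, 0]]


def eta(q):
    return [q[1], 0]


def momenta(q, qdot):
    """Canonical momenta p_i = sum_j W_ij qdot_j + eta_i(q), as in (E-6)."""
    return [sum(W[i][j] * qdot[j] for j in range(2)) + eta(q)[i] for i in range(2)]


def primary_constraints(q, p):
    """Evaluate sum_i v_i (p_i - eta_i(q)) for every zero eigenvector v of W."""
    e = eta(q)
    return [sum(v[i] * (p[i] - e[i]) for i in range(2)) for v in null_space(W)]
```

For this W the only zero eigenvector is (0, 1), so the single primary constraint reduces to p₂=0, and it is satisfied by the momenta computed from any choice of velocities.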
The derivation above is valid for general Lagrangians and their Hessians. Let us assume {W_ij(q, q̇)} is the Hessian of a given Lagrangian L. Let {W_ab|a, b=1, . . . , R_W} be the largest invertible sub-matrix of {W_ij}, with suitable rearrangement of rows and columns if necessary. We then solve (E-1) for the R_W velocities q̇^(a) in terms of {q^(i)|i=1, . . . , N}, {p_a|a=1, . . . , R_W} and the remaining velocities {q̇^(α)|α=1, . . . N−R_W}. That is
$\begin{matrix} {\dot{q}}^{(a)} = f_{a} (q, \{p_{b}\}, \{{\dot{q}}^{(β)}\}) & (E - 12) \end{matrix}$
with a, b=1, . . . , R_W and β=R_W+1, . . . , N.
Inserting these relations into (E-1), we get relations of the form
$\begin{matrix} p_{j} = h_{j} (q, \{p_{a}\}, \{{\dot{q}}^{(α)}\}) & (E - 13) \end{matrix}$
with a, j=1, . . . , R_W and α=R_W+1, . . . , N. This relation reduces to an identity by construction. The remaining equations are of the form
$\begin{matrix} p_{α} = h_{α} (q, \{p_{a}\}, \{{\dot{q}}^{(β)}\}) & (E - 14) \end{matrix}$
with α=1, . . . , N−R_W. However, the right hand side cannot depend on the velocities {q̇^(β)}, since otherwise we could express more velocities in terms of the coordinates, the momenta and the remaining velocities.
Hamiltonian Equations of Motion for Constrained Systems
Theorem 3.16.1.
On the subspace Γ_p defined by the primary constraints, Γ_p={(q, p)|φ_α(q, p)=0, α=1, . . . , N−R_W} with φ_α defined by (E-11), the Hamiltonian is a function only of the coordinates {q^(i)|i=1, . . . , N} and the momenta {p_a|a=1, . . . , R_W}, and does not depend on the velocities {q̇^(α)|α=1, . . . , N−R_W}.
Proof.
On Γ_p the Hamiltonian is given by
$\begin{matrix} H_{o} = {\left. H_{c} \right|}_{Γ_{p}} = \sum_{a = 1}^{R_{W}} p_{a} f_{a} + \sum_{α = 1}^{N - R_{W}} g_{α} {\dot{q}}^{(α)} - L (q, \{f_{b}\}, \{{\dot{q}}^{(β)}\}) & (E - 15) \end{matrix}$
where f_a, a=1, . . . , R_W is given by (E-12) and g_α, α=1, . . . , N−R_W is given by (E-10). We want to show that H_o does not depend on q̇^(β), β=1, . . . , N−R_W. We compute
$\begin{matrix} \begin{matrix} \frac{\partial H_{o}}{\partial {\dot{q}}^{(β)}} = \sum_{a = 1}^{R_{W}} p_{a} \frac{\partial f_{a}}{\partial {\dot{q}}^{(β)}} + g_{β} - \sum_{a = 1}^{R_{W}} {\left. \frac{\partial L}{\partial {\dot{q}}^{(a)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} \frac{\partial f_{a}}{\partial {\dot{q}}^{(β)}} - {\left. \frac{\partial L}{\partial {\dot{q}}^{(β)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} \\ = \sum_{a = 1}^{R_{W}} (p_{a} - {\left. \frac{\partial L}{\partial {\dot{q}}^{(a)}} \right|}_{{\dot{q}}^{(a)} = f_{a}}) \frac{\partial f_{a}}{\partial {\dot{q}}^{(β)}} + g_{β} - {\left. \frac{\partial L}{\partial {\dot{q}}^{(β)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} \end{matrix} & (E - 16) \end{matrix}$
Since by definition
$p_{a} = \frac{\partial L}{\partial {\dot{q}}^{(a)}}, a = 1, \dots, R_{W}$
and from (E-11)
$\begin{matrix} g_{β} = p_{β} = {\left. \frac{\partial L}{\partial {\dot{q}}^{(β)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} . & (E - 17) \end{matrix}$
we get
$\frac{\partial H_{o}}{\partial {\dot{q}}^{(β)}} = 0, β = 1, \dots, N - R_{W} .$
and therefore
$H_{o} (q, \{p_{a}\}, \{{\dot{q}}^{(α)}\}) = H_{o} (q, \{p_{a}\}) .$
Theorem 3.16.2.
In the presence of primary constraints (E-11), the Hamilton equations of motion are given by
$\begin{matrix} \begin{matrix} {\dot{q}}^{(i)} = \frac{\partial H_{o}}{\partial p_{i}} + \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial φ_{β}}{\partial p_{i}}, & i = 1, \dots, N \\ {\dot{p}}_{i} = - \frac{\partial H_{o}}{\partial q^{(i)}} - \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial φ_{β}}{\partial q^{(i)}}, & i = 1, \dots, N \\ φ_{α} (p, q) = 0, & α = 1, \dots, N - R_{W} \end{matrix} & (E - 18) \end{matrix}$
where the q̇^(β) are a priori undetermined velocities.
Proof: From (E-15) and Theorem 3.16.1 we obtain
$\begin{matrix} \begin{matrix} \frac{\partial H_{o}}{\partial p_{a}} = f_{a} + \sum_{b = 1}^{R_{W}} p_{b} \frac{\partial f_{b}}{\partial p_{a}} + \sum_{β = 1}^{N - R_{W}} \frac{\partial g_{β}}{\partial p_{a}} {\dot{q}}^{(β)} - \sum_{b = 1}^{R_{W}} \frac{\partial L}{\partial {\dot{q}}^{(b)}} \frac{\partial f_{b}}{\partial p_{a}} \\ = {\dot{q}}^{(a)} + \sum_{b = 1}^{R_{W}} (p_{b} - \frac{\partial L}{\partial {\dot{q}}^{(b)}}) \frac{\partial f_{b}}{\partial p_{a}} + \sum_{β = 1}^{N - R_{W}} \frac{\partial g_{β}}{\partial p_{a}} {\dot{q}}^{(β)} \\ = {\dot{q}}^{(a)} + \sum_{β = 1}^{N - R_{W}} \frac{\partial g_{β}}{\partial p_{a}} {\dot{q}}^{(β)} \end{matrix} & (E - 19) \end{matrix}$
with a=1, . . . , R_W. Further
$\begin{matrix} \begin{matrix} \frac{\partial H_{o}}{\partial q^{(i)}} = \sum_{b = 1}^{R_{W}} p_{b} \frac{\partial f_{b}}{\partial q^{(i)}} + \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial g_{β}}{\partial q^{(i)}} - {\left. \frac{\partial L}{\partial q^{(i)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} - \sum_{b = 1}^{R_{W}} {\left. \frac{\partial L}{\partial {\dot{q}}^{(b)}} \right|}_{{\dot{q}}^{(b)} = f_{b}} \frac{\partial f_{b}}{\partial q^{(i)}} \\ = \sum_{b = 1}^{R_{W}} (p_{b} - {\left. \frac{\partial L}{\partial {\dot{q}}^{(b)}} \right|}_{{\dot{q}}^{(b)} = f_{b}}) \frac{\partial f_{b}}{\partial q^{(i)}} + \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial g_{β}}{\partial q^{(i)}} - {\left. \frac{\partial L}{\partial q^{(i)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} \\ = \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial g_{β}}{\partial q^{(i)}} - {\left. \frac{\partial L}{\partial q^{(i)}} \right|}_{{\dot{q}}^{(a)} = f_{a}} \\ = \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial g_{β}}{\partial q^{(i)}} - {\left. \frac{d}{dt} (\frac{\partial L}{\partial {\dot{q}}^{(i)}}) \right|}_{{\dot{q}}^{(a)} = f_{a}} \end{matrix} & (E - 20) \end{matrix}$
from (add reference).
$\begin{matrix} \frac{\partial H_{o}}{\partial q^{(i)}} = - {\dot{p}}_{i} + \sum_{β = 1}^{N - R_{w}} {\dot{q}}^{(β)} \frac{\partial g_{β}}{\partial q^{(i)}} & (E - 21) \end{matrix}$

From (E-19) and (E-21) we get:

$\begin{matrix} \begin{matrix} {\dot{q}}^{(a)} = \frac{\partial H_{o}}{\partial p_{a}} - \sum_{β = 1}^{N - R_{W}} \frac{\partial g_{β}}{\partial p_{a}} {\dot{q}}^{(β)}, & a = 1, \dots, R_{W} \\ {\dot{p}}_{i} = - \frac{\partial H_{o}}{\partial q^{(i)}} + \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial g_{β}}{\partial q^{(i)}}, & i = 1, \dots, N \end{matrix} & (E - 22) \end{matrix}$

Since

$\frac{\partial H_{o}}{\partial p_{α}} = 0 \text{ and } \frac{\partial φ_{β}}{\partial p_{α}} = δ_{β α}$
we can supplement these equations with
$\begin{matrix} {\dot{q}}^{(α)} = \frac{\partial H_{o}}{\partial p_{α}} + \sum_{β = 1}^{N - R_{W}} \frac{\partial φ_{β}}{\partial p_{α}} {\dot{q}}^{(β)}, α = 1, \dots, N - R_{W} & (E - 23) \end{matrix}$
So, using ∂φ_β/∂p_a=−∂g_β/∂p_a and ∂φ_β/∂q^(i)=−∂g_β/∂q^(i), we can write
$\begin{matrix} \begin{matrix} {\dot{q}}^{(i)} = \frac{\partial H_{o}}{\partial p_{i}} + \sum_{β = 1}^{N - R_{W}} \frac{\partial φ_{β}}{\partial p_{i}} {\dot{q}}^{(β)}, & i = 1, \dots, N \\ {\dot{p}}_{i} = - \frac{\partial H_{o}}{\partial q^{(i)}} - \sum_{β = 1}^{N - R_{W}} {\dot{q}}^{(β)} \frac{\partial φ_{β}}{\partial q^{(i)}}, & i = 1, \dots, N \end{matrix} & (E - 24) \end{matrix}$
For consistency with (E-11) we should also require
$\begin{matrix} {\dot{p}}_{α} = \frac{d}{dt} g_{α} (q, \{p_{a}\}), α = 1, \dots, N - R_{W} & (E - 25) \end{matrix}$
where ṗ_α is given by the right hand side of (E-22).
Streamlining the Hamiltonian Equation of Motion (EOM)
Definition 3.16-1.
A function f is weakly equal to h, denoted by f≈h, if f and h are equal on the subspace Γ_p defined by the primary constraints φ_β=0, that is, when
$f|_{Γ_{p}} = h|_{Γ_{p}}$
Equivalently,
f(q,p)≈h(q,p)
means that f(q,p)=h(q,p) when {φ_α(q,p)=0}.
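As a small illustration (an assumed example, not taken from the source), suppose there is a single primary constraint φ=p₂ and take
$f = q^{(1)} + q^{(1)} p_{2}, h = q^{(1)}$
Then f≈h, since the two functions agree wherever p₂=0. Their derivatives, however, need not agree weakly:
$\frac{\partial f}{\partial p_{2}} = q^{(1)}, \frac{\partial h}{\partial p_{2}} = 0$
Subtracting the constraint multiplied by the derivative with respect to the constrained momentum removes the discrepancy:
$f - φ \frac{\partial f}{\partial p_{2}} = q^{(1)} + q^{(1)} p_{2} - p_{2} q^{(1)} = q^{(1)} = h - φ \frac{\partial h}{\partial p_{2}}$
This is the kind of correction term that appears in the theorem below.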
Theorem 3.16.3.
Assume f, h are defined over the entire space spanned by {q^(i)}, {p_i}. If
$\begin{matrix} {\left. f (q, p) \right|}_{Γ_{p}} = {\left. h (q, p) \right|}_{Γ_{p}} & (E - 26) \end{matrix}$

then

$\begin{matrix} \frac{\partial}{\partial q^{(i)}} (f - \sum_{β} φ_{β} \frac{\partial f}{\partial p_{β}}) \approx \frac{\partial}{\partial q^{(i)}} (h - \sum_{β} φ_{β} \frac{\partial h}{\partial p_{β}}) & (E - 27) \end{matrix}$
and
$\frac{\partial}{\partial p_{i}} (f - \sum_{β} φ_{β} \frac{\partial f}{\partial p_{β}}) \approx \frac{\partial}{\partial p_{i}} (h - \sum_{β} φ_{β} \frac{\partial h}{\partial p_{β}})$
for i=1, . . . N.
Proof: Consider the two functions f(q, {p_a},{p_β}) and h(q,{p_a},{p_β}). Using (E-11) and the hypothesis of the theorem,
$\begin{matrix} f (q, \{p_{a}\}, \{g_{α}\}) = h (q, \{p_{a}\}, \{g_{α}\}) & (E - 28) \end{matrix}$
Thus it follows
$\begin{matrix} \begin{matrix} {(\frac{\partial f}{\partial q^{(i)}} + \sum_{a} \frac{\partial f}{\partial p_{a}} \frac{\partial p_{a}}{\partial q^{(i)}} + \sum_{β} \frac{\partial f}{\partial p_{β}} \frac{\partial g_{β}}{\partial q^{(i)}})}_{Γ_{p}} = \\ {(\frac{\partial h}{\partial q^{(i)}} + \sum_{a} \frac{\partial h}{\partial p_{a}} \frac{\partial p_{a}}{\partial q^{(i)}} + \sum_{β} \frac{\partial h}{\partial p_{β}} \frac{\partial g_{β}}{\partial q^{(i)}})}_{Γ_{p}} \end{matrix} & (E - 29) \end{matrix}$
and
$\begin{matrix} \begin{matrix} {(\frac{\partial f}{\partial p_{i}} + \sum_{a \neq i} \frac{\partial f}{\partial p_{a}} \frac{\partial p_{a}}{\partial p_{i}} + \sum_{β} \frac{\partial f}{\partial p_{β}} \frac{\partial g_{β}}{\partial p_{i}})}_{Γ_{p}} = \\ {(\frac{\partial h}{\partial p_{i}} + \sum_{a \neq i} \frac{\partial h}{\partial p_{a}} \frac{\partial p_{a}}{\partial p_{i}} + \sum_{β} \frac{\partial h}{\partial p_{β}} \frac{\partial g_{β}}{\partial p_{i}})}_{Γ_{p}} \end{matrix} & (E - 30) \end{matrix}$
Note since φ_α(q,p)=p_α−g_α(q, {p_a}), we have
$\begin{matrix} \frac{\partial g_{β}}{\partial q^{(i)}} = - \frac{\partial φ_{β} (q, p)}{\partial q^{(i)}} \\ and \\ \frac{\partial g_{β}}{\partial p_{i}} = - \frac{\partial φ_{β} (q, p)}{\partial p_{i}} \end{matrix}$
and
φ_α(q,p)=0
on Γ_p for α=1, . . . N−R_W. We have
${(\frac{\partial f}{\partial q^{(i)}} - \sum_{β} \frac{\partial f}{\partial p_{β}} \frac{\partial φ_{β}}{\partial q^{(i)}})}_{Γ_{p}} = {(\frac{\partial h}{\partial q^{(i)}} - \sum_{β} \frac{\partial h}{\partial p_{β}} \frac{\partial φ_{β}}{\partial q^{(i)}})}_{Γ_{p}}$
which can be written as
$\frac{\partial}{\partial q^{(i)}} (f - \sum_{β} φ_{β} \frac{\partial f}{\partial p_{β}}) \approx \frac{\partial}{\partial q^{(i)}} (h - \sum_{β} φ_{β} \frac{\partial h}{\partial p_{β}})$
since the additional terms φ_β∂²f/∂q^(i)∂p_β and φ_β∂²h/∂q^(i)∂p_β vanish on Γ_p, where φ_β=0. Similarly,
$\frac{\partial}{\partial p_{i}} (f - \sum_{β} φ_{β} \frac{\partial f}{\partial p_{β}}) \approx \frac{\partial}{\partial p_{i}} (h - \sum_{β} φ_{β} \frac{\partial h}{\partial p_{β}})$
Corollary 3.16-1.
${\dot{q}}^{(i)} = \frac{\partial H}{\partial p_{i}} + \sum_{β} v^{(β)} \frac{\partial φ_{β}}{\partial p_{i}}$ ${\dot{p}}_{i} = - \frac{\partial H}{\partial q^{(i)}} - \sum_{β} v^{(β)} \frac{\partial φ_{β}}{\partial q^{(i)}}$
for i=1, . . . , N.
Proof.
We consider two Hamiltonians H({q⁽ⁱ⁾},{p_i}) and H_o({q⁽ⁱ⁾},{p_a}). Define H({q⁽ⁱ⁾},{p_i}) as follows
$H (\{q^{(i)}\}, \{p_{i}\}) \approx H_{o} (\{q^{(i)}\}, \{p_{a}\}) .$
Then, applying Theorem 3.16.3 with f=H and h=H_o, and noting from Theorem 3.16.1 that H_o does not depend on the p_β, we get
$\begin{matrix} \frac{\partial H_{o}}{\partial q^{(i)}} \approx \frac{\partial}{\partial q^{(i)}} (H - \sum_{β = 1}^{N - R_{W}} φ_{β} \frac{\partial H}{\partial p_{β}}) & (E - 31) \\ \frac{\partial H_{o}}{\partial p_{i}} \approx \frac{\partial}{\partial p_{i}} (H - \sum_{β = 1}^{N - R_{W}} φ_{β} \frac{\partial H}{\partial p_{β}}) & (E - 32) \end{matrix}$

Using (E-31) and (E-32) in (E-24), we get

$\begin{matrix} \begin{matrix} {\dot{q}}^{(i)} \approx \frac{\partial}{\partial p_{i}} (H - \sum_{β} φ_{β} \frac{\partial H}{\partial p_{β}}) + \sum_{β} {\dot{q}}^{(β)} \frac{\partial φ_{β}}{\partial p_{i}} \\ \text{and} \\ {\dot{p}}_{i} \approx - \frac{\partial}{\partial q^{(i)}} (H - \sum_{β} φ_{β} \frac{\partial H}{\partial p_{β}}) - \sum_{β} {\dot{q}}^{(β)} \frac{\partial φ_{β}}{\partial q^{(i)}} \\ \text{or} \\ {\dot{q}}^{(i)} \approx \frac{\partial}{\partial p_{i}} (H - \sum_{β} φ_{β} (\frac{\partial H}{\partial p_{β}} - {\dot{q}}^{(β)})) \\ \text{and} \\ {\dot{p}}_{i} \approx - \frac{\partial}{\partial q^{(i)}} (H - \sum_{β} φ_{β} (\frac{\partial H}{\partial p_{β}} - {\dot{q}}^{(β)})) \end{matrix} & (E - 33) \end{matrix}$

Define

$v^{(β)} \equiv {\dot{q}}^{(β)} - \frac{\partial H}{\partial p_{β}}$
$H_{T} \equiv H + \sum_{β} v^{(β)} φ_{β}$
So (E-33) becomes
$\begin{matrix} \begin{matrix} {\dot{q}}^{(i)} \approx \frac{\partial H_{T}}{\partial p_{i}} \\ {\dot{p}}_{i} \approx - \frac{\partial H_{T}}{\partial q^{(i)}} \end{matrix} & (E - 34) \end{matrix}$
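To make the total Hamiltonian concrete, the following sketch applies the construction to the Lagrangian L=½(q̇⁽¹⁾)²+q̇⁽¹⁾q⁽²⁾+½(q⁽¹⁾−q⁽²⁾)² of (3.17-17), which appears later in this document. The intermediate steps are worked out here as an illustration and are not part of the source.
$\begin{matrix} p_{1} = {\dot{q}}^{(1)} + q^{(2)}, p_{2} = 0, \text{ so that } φ = p_{2} \approx 0 \\ H_{o} = p_{1} {\dot{q}}^{(1)} - L = \frac{1}{2} {(p_{1} - q^{(2)})}^{2} - \frac{1}{2} {(q^{(1)} - q^{(2)})}^{2} \\ H_{T} = H_{o} + v p_{2} \\ {\dot{q}}^{(1)} \approx \frac{\partial H_{T}}{\partial p_{1}} = p_{1} - q^{(2)}, {\dot{q}}^{(2)} \approx \frac{\partial H_{T}}{\partial p_{2}} = v \\ {\dot{p}}_{1} \approx - \frac{\partial H_{T}}{\partial q^{(1)}} = q^{(1)} - q^{(2)}, {\dot{p}}_{2} \approx - \frac{\partial H_{T}}{\partial q^{(2)}} = p_{1} - q^{(1)} \end{matrix}$
Requiring that the primary constraint be preserved in time, ṗ₂≈0, gives p₁−q⁽¹⁾≈0. With p₁=q̇⁽¹⁾+q⁽²⁾ this reads q̇⁽¹⁾+q⁽²⁾−q⁽¹⁾=0, which agrees with the constraint found from the zero mode in the Lagrangian treatment of the same example. The velocity q̇⁽²⁾=v remains undetermined at this stage.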
Constrained Hamiltonian Systems
Local symmetries in a Lagrangian-based model. Consider the variations
$q^{(i)} \rightarrow q^{(i)} (t) + δ q^{(i)} (t)$
${\dot{q}}^{(i)} \rightarrow {\dot{q}}^{(i)} (t) + δ {\dot{q}}^{(i)} (t)$
with i=1, . . . , N. The action of the system is given by
$S (q, \dot{q}) = \int L (q, \dot{q}) dt$
where q and q̇ are N-dimensional column vectors. The variation of the action is
$\begin{matrix} δ S = \int L (q + δ q, \dot{q} + δ \dot{q}) dt - \int L (q, \dot{q}) dt \\ = \int [\sum_{i} \frac{\partial L}{\partial q^{(i)}} δ q^{(i)} + \sum_{i} \frac{\partial L}{\partial {\dot{q}}^{(i)}} δ {\dot{q}}^{(i)}] dt \\ = - \int \sum_{i} [\frac{d}{dt} \frac{\partial L}{\partial {\dot{q}}^{(i)}} - \frac{\partial L}{\partial q^{(i)}}] δ q^{(i)} dt \\ = - \int dt \sum_{i} E_{i}^{(o)} (q, \dot{q}, \ddot{q}) δ q^{(i)} \end{matrix}$
where we have integrated by parts, dropped the boundary term, and defined the Euler-Lagrange differential operator
$\begin{matrix} \begin{matrix} E_{i}^{(o)} (q, \dot{q}, \ddot{q}) = \frac{d}{dt} \frac{\partial L}{\partial {\dot{q}}^{(i)}} - \frac{\partial L}{\partial q^{(i)}} . \\ \text{Note that } \int \sum_{i = 1}^{N} E_{i}^{(o)} (q, \dot{q}, \ddot{q}) δ q^{(i)} dt \equiv 0 \end{matrix} & (3.17 - 1) \end{matrix}$
on shell. Expanding E_i^(o) gives
$\begin{matrix} E_{i}^{(o)} (q, \dot{q}, \ddot{q}) = \sum_{j} [\frac{\partial^{2} L (q, \dot{q})}{\partial {\dot{q}}^{(i)} \partial {\dot{q}}^{(j)}} {\ddot{q}}^{(j)} + \frac{\partial^{2} L (q, \dot{q})}{\partial {\dot{q}}^{(i)} \partial q^{(j)}} {\dot{q}}^{(j)}] - \frac{\partial L (q, \dot{q})}{\partial q^{(i)}} \\ = \sum_{j} W_{ij} (q, \dot{q}) {\ddot{q}}^{(j)} + \sum_{j} \frac{\partial^{2} L (q, \dot{q})}{\partial {\dot{q}}^{(i)} \partial q^{(j)}} {\dot{q}}^{(j)} - \frac{\partial L (q, \dot{q})}{\partial q^{(i)}} \\ = \sum_{j} W_{ij} (q, \dot{q}) {\ddot{q}}^{(j)} + k_{i} (q, \dot{q}) \end{matrix}$
If L is singular, the N×N matrix W is not invertible, so (3.17-1) cannot be solved for q̈^(i), i=1, . . . , N. If Rank(W(q, q̇))=R_W on shell, then there exist N−R_W constraints in the theory. There exist N−R_W independent left (or right) zero mode eigenvectors w^(o,k), k=1, . . . N−R_W such that
$\begin{matrix} \sum_{i} w_{i}^{(o, k)} (q, \dot{q}) W_{ij} (q, \dot{q}) = 0, k = 1, \dots, N - R_{W} & (3.17 - 2) \end{matrix}$

Thus

$φ^{(o, k)} = \sum_{i = 1}^{N} w_{i}^{(o, k)} (q, \dot{q}) E_{i}^{(o)} (q, \dot{q}, \ddot{q})$
depend on q and q̇ only. The φ^(o,k) also vanish on shell:
$φ^{(o, k)} (q, \dot{q}) = 0, k = 1, \dots, N - R_{W}$
The set {φ^(o,k)|k=1, . . . , N−R_W} are the zero generation constraints. It is possible that not all the {φ^(o,k)} are linearly independent. So we may find linear combinations of the zero mode eigenvectors
$v_{i}^{(o, n_{o})} = \sum_{k} c_{k}^{(n_{o})} w_{i}^{(o, k)}$
such that we have
$\begin{matrix} G^{(o, n_{o})} = v^{(o, n_{o})} E^{(o)} \equiv 0, n_{o} = 1, \dots, N_{o} & (3.17 - 3) \end{matrix}$
These are called gauge identities.
Any variation δq_i, i=1, . . . N, of the form
$δ q_{i} = \sum_{n_{o}} ɛ_{n_{o}} v_{i}^{(o, n_{o})}$
leaves the action invariant by (3.17-1).
Given this definition of δq_i and (3.17-3), we conclude
$\begin{matrix} δ S = - \int dt \sum_{i = 1}^{N} E_{i}^{(o)} (q, \dot{q}, \ddot{q}) \sum_{n_{o}} ɛ_{n_{o}} (t) v_{i}^{(o, n_{o})} \\ = - \int dt \sum_{n_{o}} ɛ_{n_{o}} \sum_{i = 1}^{N} E_{i}^{(o)} (q, \dot{q}, \ddot{q}) v_{i}^{(o, n_{o})} (q, \dot{q}) \\ = - \int dt \sum_{n_{o}} ɛ_{n_{o}} G^{(o, n_{o})} \\ \equiv 0 \end{matrix}$
everywhere. The remaining zero generating modes, which we denote by u^(o,n_o), lead to genuine constraints. They are of the form φ^(o,n_o)(q, q̇)=0 on shell, where
$\begin{matrix} φ^{(o, n_{o})} = u^{(o, n_{o})} E^{(o)} . & (3.17 - 4) \end{matrix}$
The algorithm now proceeds as follows. We separate the gauge identities (3.17-3) from the nontrivial constraints (3.17-4) and list them separately. They will be used for determining local symmetry transformations.
Next we want to search for additional constraints. We do this by searching for further functions of the coordinates and velocities which vanish in the space of physical trajectories. To this effect consider the following N+N_o vector constructed from E^(o) and the time derivative of the constraints (3.17-4)
$\begin{matrix} [E^{(1)}] = [\begin{matrix} E^{(o)} \\ \frac{d}{dt} (u^{(o, 1)} E^{(o)}) \\ ⋮ \\ \frac{d}{dt} (u^{(o, N_{o})} E^{(o)}) \end{matrix}] = [\begin{matrix} E^{(o)} \\ \frac{d}{dt} φ^{(o)} \end{matrix}] & (3.17 - 5) \end{matrix}$
by construction. The constraint φ^(o) is valid for all time and therefore
$\frac{d}{dt} φ^{(o)} = 0$
on shell, but
$\frac{d φ^{(o, i)}}{dt} = \nabla_{\dot{q}} (u^{(o, i)} E^{(o)}) \ddot{q} + \nabla_{q} (u^{(o, i)} E^{(o)}) \dot{q}$
So
$\begin{matrix} E_{i_{1}}^{(1)} = \sum_{j = 1}^{N} W_{i_{1} j}^{(1)} (q, \dot{q}) {\ddot{q}}^{(j)} + k_{i_{1}}^{(1)} (q, \dot{q}) & (3.17 - 6) \end{matrix}$
where i₁=1, . . . , N+N_o, and
$\begin{matrix} [W_{i_{1} j}^{(1)}] = [\begin{matrix} W^{(o)} \\ \nabla_{\dot{q}} (u^{(o, 1)} E^{(o)}) \\ ⋮ \\ \nabla_{\dot{q}} (u^{(o, N_{o})} E^{(o)}) \end{matrix}], [k_{i_{1}}^{(1)}] = [\begin{matrix} k^{(o)} \\ \sum_{j} \frac{\partial}{\partial q^{(j)}} (u^{(o, 1)} E^{(o)}) {\dot{q}}^{(j)} \\ ⋮ \\ \sum_{j} \frac{\partial}{\partial q^{(j)}} (u^{(o, N_{o})} E^{(o)}) {\dot{q}}^{(j)} \end{matrix}] & (3.17 - 7) \end{matrix}$
We next look for the zero modes of W⁽¹⁾. By construction, these zero modes include the zero modes of the previous level. The gauge identities at level 1 are
$\begin{matrix} G^{(1, n_{1})} = v^{(1, n_{1})} E^{(1)} - \sum_{n_{o} = 1}^{N_{o}} M_{n_{1} n_{o}}^{(1, 0)} (u^{(o, n_{o})} E^{(o)}) \equiv 0 & (3.17 - 8) \end{matrix}$
where n₁=1, . . . , N₁ and the genuine constraints are of the form
$\begin{matrix} φ^{(1, n_{1})} = u^{(1, n_{1})} E^{(1)} = 0 & (3.17 - 9) \end{matrix}$
with n₁=1, . . . , N₁ on shell.
We next adjoin the new identities (3.17-8) to the ones determined earlier in (3.17-3). With the remaining constraints (3.17-9) we proceed as before, adjoining their time derivatives to (3.17-5) and constructing the matrix W and the vector k of the next level.
The iterative process will terminate at some level M if either i) there are no further zero modes, or ii) the new constraints can be expressed as linear combinations of previous constraints.
The Maximal Set of Linearly Independent Gauge Identities Generated by the Algorithm
Note that the algorithm steps are of the form
$\begin{matrix} G^{(o, n_{o})} = v^{(o, n_{o})} E^{(o)} \equiv 0 & (3.17 - 10) \end{matrix}$
$\begin{matrix} G^{(l, n_{l})} = v^{(l, n_{l})} E^{(l)} - \sum_{l^{'} = 0}^{l - 1} \sum_{n_{l^{'}} = 1}^{N_{l^{'}}} M_{n_{l} n_{l^{'}}}^{(l, l^{'})} φ^{(l^{'}, n_{l^{'}})} \equiv 0 & (3.17 - 11) \end{matrix}$
with l=1, . . . , N₁. The M_{n_l n_l′}^(l,l′) are functions of q and q̇ only. And
$\begin{matrix} φ^{(l, n_{l})} = u^{(l, n_{l})} E^{(l)}, n_{l} = 1, \dots, N_{l}, & (3.17 - 12) \end{matrix}$
$\begin{matrix} E^{(l)} = [\begin{matrix} E^{(o)} \\ \frac{d φ^{(o)}}{dt} \\ ⋮ \\ \frac{d φ^{(l - 1)}}{dt} \end{matrix}] & (3.17 - 13) \end{matrix}$
where φ^(l) is a column vector with N_l components φ^(l,n_l). Thus we conclude from (3.17-13) and (3.17-11) that the general form of the gauge identity given by (3.17-11) is
$\begin{matrix} G^{(l, n_{l})} = \sum_{i = 1}^{N} \sum_{m = 0}^{l} ς_{mi}^{(l, n_{l})} \frac{d^{m}}{{dt}^{m}} E_{i}^{(o)} \equiv 0 & (3.17 - 14) \end{matrix}$
where the ς_mi^(l,n_l) are functions of q and q̇, and N_l<M. From (3.17-14) it also follows that
$\begin{matrix} \sum_{l = 1}^{M} \sum_{n_{l} = 1}^{N_{l}} ɛ^{(l, n_{l})} G^{(l, n_{l})} \equiv 0 & (3.17 - 15) \end{matrix}$
This identity can also be written as
$\begin{matrix} \begin{matrix} \sum_{i} δ q^{(i)} E_{i}^{(o)} - \frac{d}{dt} F \equiv 0 \\ \text{where} \\ δ q^{(i)} = \sum_{l = 1}^{M} \sum_{n_{l} = 1}^{N_{l}} \sum_{m = 0}^{l} {(- 1)}^{m} \frac{d^{m}}{{dt}^{m}} (ς_{mi}^{(l, n_{l})} ɛ^{(l, n_{l})} (t)) \end{matrix} & (3.17 - 16) \end{matrix}$
and F is a complicated function of q and q̇. By collecting the indices l, n_l together into a single index a,
$δ q_{i} = \sum_{l = 1}^{M} \sum_{n_{l} = 1}^{N_{l}} \sum_{m = 0}^{l} {(- 1)}^{m} \frac{d^{m}}{{dt}^{m}} (ς_{mi}^{(a)} ɛ^{(a)} (t)), a \equiv (l, n_{l})$
Example of Constrained Hamiltonian System in Lagrangian Form
Let
L(q,{dot over (q)})=½{dot over (q)} ²⁽¹⁾ +{dot over (q)} ⁽¹⁾ q ⁽²⁾+½(q ⁽¹⁾ −q ⁽²⁾)² (3.17-17)
$\begin{matrix} E^{(o)} = [\begin{matrix} \frac{d}{dt} \frac{\partial L}{\partial {\dot{q}}^{(1)}} - \frac{\partial L}{\partial q^{(1)}} \\ \frac{d}{dt} \frac{\partial L}{\partial {\dot{q}}^{(2)}} - \frac{\partial L}{\partial q^{(2)}} \end{matrix}] = [\begin{matrix} {\ddot{q}}^{(1)} + {\dot{q}}^{(2)} - q^{(1)} + q^{(2)} \\ - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \end{matrix}] & (3.17 - 18) \\ W = [\begin{matrix} 1 & 0 \\ 0 & 0 \end{matrix}] & (3.17 - 19) \\ k = [\begin{matrix} {\dot{q}}^{(2)} - q^{(1)} + q^{(2)} \\ - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \end{matrix}] & (3.17 - 20) \end{matrix}$
The only zero mode of W is
$u^{(o)} = [\begin{matrix} 0 & 1 \end{matrix}]$

Then

$\begin{matrix} \begin{matrix} E^{(o)} = W^{(o)} \ddot{q} + k^{(o)} \\ = [\begin{matrix} 1 & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\ddot{q}}^{(1)} \\ {\ddot{q}}^{(2)} \end{matrix}] + [\begin{matrix} {\dot{q}}^{(2)} - q^{(1)} + q^{(2)} \\ - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \end{matrix}] \end{matrix} & (3.17 - 21) \end{matrix}$
Then
$\begin{matrix} u^{(o)} E^{(o)} = [\begin{matrix} 0 & 1 \end{matrix}] [[\begin{matrix} 1 & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\ddot{q}}^{(1)} \\ {\ddot{q}}^{(2)} \end{matrix}] + [\begin{matrix} {\dot{q}}^{(2)} - q^{(1)} + q^{(2)} \\ - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \end{matrix}]] \\ = - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \\ = 0 \end{matrix}$
on shell. Thus there are no gauge identities for E^(o). Now construct E⁽¹⁾.
$E^{(1)} = [\begin{matrix} E^{(o)} \\ \frac{d}{dt} (u^{(o)} E^{(o)}) \end{matrix}] = [\begin{matrix} {\ddot{q}}^{(1)} + {\dot{q}}^{(2)} - q^{(1)} + q^{(2)} \\ - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \\ - {\ddot{q}}^{(1)} - {\dot{q}}^{(2)} + {\dot{q}}^{(1)} \end{matrix}]$
which can be written
$\begin{matrix} E^{(1)} = W^{(1)} \ddot{q} + k^{(1)} \\ = [\begin{matrix} 1 & 0 \\ 0 & 0 \\ - 1 & 0 \end{matrix}] [\begin{matrix} {\ddot{q}}^{(1)} \\ {\ddot{q}}^{(2)} \end{matrix}] + [\begin{matrix} {\dot{q}}^{(2)} - q^{(1)} + q^{(2)} \\ - {\dot{q}}^{(1)} - q^{(2)} + q^{(1)} \\ - {\dot{q}}^{(2)} + {\dot{q}}^{(1)} \end{matrix}] \end{matrix}$
The zero modes of W⁽¹⁾ are
$\begin{matrix} [\begin{matrix} 0 & 1 & 0 \end{matrix}] \\ [\begin{matrix} 1 & 0 & 1 \end{matrix}] \end{matrix}$
The first zero mode is the previous one augmented by one dimension and reproduces the previous constraint. The second mode reproduces the negative of the constraint (3.17-21). That is,
$v^{(1)} E^{(1)} = - u^{(o)} E^{(o)}$
with v⁽¹⁾=[1 0 1]. This leads to the gauge identity
$G^{(1)} = v^{(1)} E^{(1)} + u^{(o)} E^{(o)} \equiv 0$
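The gauge identity of this example can be checked numerically. The sketch below is an illustration, not part of the source: it writes out the Euler-Lagrange expressions of the Lagrangian (3.17-17) by hand, extends them with the time derivative of the level-0 constraint, and evaluates G⁽¹⁾ at arbitrary off-shell values of q, q̇ and q̈. The function names are hypothetical.

```python
def E0(q, qd, qdd):
    """Euler-Lagrange expressions of L = 1/2 qd1^2 + qd1*q2 + 1/2 (q1 - q2)^2."""
    q1, q2 = q
    qd1, qd2 = qd
    qdd1, _ = qdd  # the second acceleration never appears because W is singular
    return [qdd1 + qd2 - q1 + q2, -qd1 - q2 + q1]


def E1(q, qd, qdd):
    """E0 extended by d/dt of the level-0 constraint u0.E0 = -qd1 - q2 + q1."""
    _, qd2 = qd
    qd1 = qd[0]
    qdd1, _ = qdd
    return E0(q, qd, qdd) + [-qdd1 - qd2 + qd1]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


U0 = [0, 1]      # zero mode of W at level 0
V1 = [1, 0, 1]   # new zero mode of W at level 1


def gauge_identity(q, qd, qdd):
    """G1 = v1.E1 + u0.E0, which should vanish identically (not only on shell)."""
    return dot(V1, E1(q, qd, qdd)) + dot(U0, E0(q, qd, qdd))


# Arbitrary sample points; none of them needs to satisfy the equations of motion.
samples = [((1, 2), (3, 4), (5, 6)), ((-7, 0), (2, -9), (11, 13)), ((0, 0), (0, 0), (1, 1))]
for q, qd, qdd in samples:
    assert gauge_identity(q, qd, qdd) == 0
```

Because the identity holds for arbitrary inputs, it expresses a dependence among the equations of motion rather than a restriction on the trajectories, which is what distinguishes a gauge identity from a genuine constraint.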

Companionship: Reconciling Agents in the Network.

The outline of the companionship process is as follows for a system of N agents.

- Determine the state action space of the system for N−1 agents to create a Tellegen decision element.
- Update the remaining agent with the Tellegen DE.
- Repeat process so that all N agents are updated with respect to their Tellegen DEs.
- User submits query.
- System uses KB to establish equations of motion for the system in Lagrangian or Hamiltonian form.
- System determines the optimal trajectory by applying an optimization algorithm to the equations of motion, which conform to the principle of least action.
- System returns solution which is a point in the phase space and also serves as an answer to the query.
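The first three steps of the outline above can be sketched as a simple round-robin update. The source does not specify how the Tellegen decision element is computed or how an agent is updated with it, so the sketch below makes two loudly labeled assumptions: each agent carries a single numeric state, the aggregate of the other N−1 agents is their mean, and an update moves the remaining agent part of the way toward that aggregate. The names `tellegen_element`, `companionship_round` and `gain` are hypothetical.

```python
def tellegen_element(states, excluded):
    """ASSUMPTION: summarize all agents except `excluded` by the mean of their states."""
    others = [s for i, s in enumerate(states) if i != excluded]
    return sum(others) / len(others)


def companionship_round(states, gain=0.5):
    """One pass over all N agents.

    For each agent in turn, build the aggregate of the other N-1 agents and
    move the remaining agent toward it (ASSUMED update rule, not from the source).
    """
    states = list(states)
    for i in range(len(states)):
        target = tellegen_element(states, i)
        states[i] += gain * (target - states[i])
    return states


def spread(states):
    """Largest disagreement between any two agents."""
    return max(states) - min(states)


agents = [0.0, 4.0, 8.0]
for _ in range(50):
    agents = companionship_round(agents)
```

Under these assumptions every update is a convex combination of existing states, so the disagreement between agents cannot grow and the states converge toward a common value. A real decision element would carry a richer state and action description than a single number.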

It will also be appreciated that in some embodiments the functionality provided by the routines discussed above may be provided in alternative ways, such as being split among more routines or consolidated into fewer routines. Similarly, in some embodiments illustrated routines may provide more or less functionality than is described, such as when other illustrated routines instead lack or include such functionality respectively, or when the amount of functionality that is provided is altered. In addition, while various operations may be illustrated as being performed in a particular manner (e.g., in serial or in parallel, synchronously or asynchronously, etc.) and/or in a particular order, those skilled in the art will appreciate that in other embodiments the operations may be performed in other orders and in other manners. Those skilled in the art will also appreciate that the data structures discussed above may be structured in different manners, such as by having a single data structure split into multiple data structures or by having multiple data structures consolidated into a single data structure. Similarly, in some embodiments illustrated data structures may store more or less information than is described, such as when other illustrated data structures instead lack or include such information respectively, or when the amount or types of information that is stored is altered.
From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims and the elements recited therein. In addition, while certain aspects of the invention are presented below in certain claim forms, the inventors contemplate the various aspects of the invention in any available claim form. For example, while only some aspects of the invention may currently be recited as being embodied in a computer-readable medium, other aspects may likewise be so embodied.

Claims

1. A computer-implemented method comprising:

receiving, by an automated control system implemented at least in part by one or more computing systems, information for use in controlling electrical power being output by a battery via electrical direct current (DC) supplied by the battery, wherein the received information includes a model based on multiple rules that each has one or more conditions to evaluate and that specify restrictions involving manipulating a DC-to-DC amplifier connected to the battery in a manner to achieve one or more defined goals including maintaining an internal state of the battery that includes an internal temperature of the battery being in a defined range during the controlling of the electrical power being output, wherein the defined range includes a range of internal temperatures of the battery at which the battery operates without causing premature damage; and

controlling, by the automated control system and based on the received information, the electrical power being output by the battery, including:

obtaining, by the automated control system and at one or more times, sensor information identifying current values for one or more attributes of the battery at the one or more times, and information about one or more electrical loads to be satisfied at the one or more times;

generating, by the automated control system and based at least in part on the obtained sensor information, one or more estimates of the internal state of the battery for the one or more times, including one or more estimates of the internal temperature of the battery for the one or more times;

determining, by the automated control system, and based at least in part on the model and on the one or more estimates of the internal state of the battery and on the one or more electrical loads, one or more amounts of electrical power for the battery to supply at the one or more times to satisfy at least some of the one or more electrical loads while maintaining the internal temperature of the battery in the defined range during the one or more times; and

implementing, by the automated control system and for each of the one or more determined amounts of electrical power, one or more settings of the DC-to-DC amplifier to cause electrical power being output to satisfy the determined amount of electrical power.
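As a non-limiting illustration, the obtaining, generating, determining and implementing steps recited above can be sketched as a single control step. Everything named in the sketch is an assumption made for illustration and is not taken from the specification: the `SensorReading` fields, the simple heating-based temperature estimate, the 0 to 45 degree Celsius range, the linear derating band, and the use of a target current as the amplifier setting.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    current_a: float       # measured battery current, amps
    voltage_v: float       # measured terminal voltage, volts
    surface_temp_c: float  # measured external temperature, deg C


def estimate_internal_temp_c(reading, internal_resistance_ohm=0.02,
                             thermal_resistance_c_per_w=0.05):
    """Hypothetical estimate: surface temperature plus resistive self-heating."""
    heat_w = reading.current_a ** 2 * internal_resistance_ohm
    return reading.surface_temp_c + heat_w * thermal_resistance_c_per_w


def determine_power_w(load_w, internal_temp_c, temp_min_c, temp_max_c,
                      derate_band_c=5.0):
    """Supply the full load when cool, derate linearly near the upper limit,
    and supply nothing when the estimate is outside the defined range."""
    if internal_temp_c < temp_min_c or internal_temp_c >= temp_max_c:
        return 0.0
    headroom_c = temp_max_c - internal_temp_c
    return load_w * min(1.0, headroom_c / derate_band_c)


def control_step(reading, load_w, temp_min_c=0.0, temp_max_c=45.0):
    """One pass: estimate internal state, pick a power amount, derive a setting."""
    temp_c = estimate_internal_temp_c(reading)
    power_w = determine_power_w(load_w, temp_c, temp_min_c, temp_max_c)
    # Assumed form of the amplifier setting: a target output current.
    target_current_a = power_w / reading.voltage_v if reading.voltage_v > 0 else 0.0
    return temp_c, power_w, target_current_a
```

In this sketch a reading near the upper temperature limit yields an amount of power smaller than the load, which corresponds to satisfying only some of the one or more electrical loads.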

2. The computer-implemented method of claim 1 wherein the one or more times include a first time and a later second time,

wherein the one or more electrical loads include a first electrical load at the first time and a second electrical load at the second time,

wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times includes determining to supply a first amount of electrical power at the first time that is less than the first electrical load and determining to supply a second amount of electrical power at the second time that satisfies the second electrical load, and

wherein the implementing of the one or more settings of the DC-to-DC amplifier includes implementing at least one first setting of the DC-to-DC amplifier at the first time to cause the first amount of electrical power to be supplied at the first time, and implementing at least one second setting of the DC-to-DC amplifier at the second time to cause the second amount of electrical power to be supplied at the second time.

3. The computer-implemented method of claim 2 wherein the determining to supply the first amount of electrical power at the first time that is less than the first electrical load includes determining that supplying a larger amount of electrical power at the first time to satisfy the first electrical load will cause the internal state of the battery to exceed the defined range.

4. The computer-implemented method of claim 1 wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times includes:

determining a first amount of electrical power at a first time that satisfies a first electrical load at the first time and that the battery is capable of supplying;

determining, based at least in part on one of the one or more estimates of the internal state of the battery that is for the first time, that the internal state of the battery exceeds the defined range; and

determining, based at least in part on the determining that the internal state of the battery exceeds the defined range, a second amount of electrical power that is less than the first amount and is not sufficient to satisfy the first electrical load,

and wherein the implementing of the one or more settings of the DC-to-DC amplifier includes implementing at least one first setting of the DC-to-DC amplifier at the first time to cause the second amount of electrical power to be supplied at the first time.

5. The computer-implemented method of claim 1 wherein the DC-to-DC amplifier is a field-effect transistor (FET) amplifier, and wherein the implementing of the one or more settings of the DC-to-DC amplifier includes determining an amount of current to supply and applying a voltage to a gate of the FET amplifier to produce the determined amount of current.

6. The computer-implemented method of claim 1 wherein the DC-to-DC amplifier is part of at least one of a buck converter or a boost converter, and wherein the implementing of the one or more settings of the DC-to-DC amplifier includes determining an amount of voltage to supply, and modifying the at least one of the buck converter or the boost converter to produce the determined amount of voltage.
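As one hedged example of modifying a buck converter or boost converter to produce a determined amount of voltage, the sketch below computes a switching duty cycle from the textbook relations for ideal converters in continuous conduction (output equals input times D for a buck converter, and input divided by 1 minus D for a boost converter). The function name and the assumption that the converter is adjusted through its duty cycle are illustrative only; real converters have losses that this sketch ignores.

```python
def duty_cycle_for_voltage(v_in, v_out):
    """Return (mode, duty_cycle) for an ideal, lossless converter.

    Hypothetical helper: chooses buck when stepping down, boost when stepping up.
    """
    if v_in <= 0 or v_out <= 0:
        raise ValueError("voltages must be positive")
    if v_out <= v_in:
        return "buck", v_out / v_in        # ideal buck:  V_out = D * V_in
    return "boost", 1.0 - v_in / v_out     # ideal boost: V_out = V_in / (1 - D)
```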

7. The computer-implemented method of claim 1 wherein the obtained sensor information identifies current values for electrical current, voltage and one or more temperatures associated with the battery, and wherein the generating of the one or more estimates of the internal state of the battery includes using a defined battery temperature model to estimate the internal temperature of the battery based at least in part on the current values.

8. The computer-implemented method of claim 1 wherein the obtained sensor information identifies a current value for at least one of electrical current, voltage or a temperature associated with the battery, and wherein the generating of the one or more estimates of the internal state of the battery also includes estimating an internal chemical reaction state for the battery.

9. The computer-implemented method of claim 1 wherein the obtained sensor information identifies a current value for at least one of electrical current, voltage or a temperature associated with the battery.

10. The computer-implemented method of claim 1 wherein the battery is part of a system having one or more electrically powered devices, and wherein the obtained information about the one or more electrical loads includes an amount of electrical demand from the one or more electrically powered devices.

11. The computer-implemented method of claim 1 wherein the battery is connected to an electrical grid, and wherein the obtained information about the one or more electrical loads includes requests from an operator of the electrical grid for electrical power to be supplied from the battery.

12. The computer-implemented method of claim 1 wherein the model further includes one or more rules that specify restrictions involving manipulating the DC-to-DC amplifier connected to the battery to maintain the internal state of the battery in the defined range during controlling of electrical power being supplied to the battery for charging, and wherein the method further comprises controlling, by the automated control system, the electrical power being supplied to the battery for charging, including:

obtaining, by the automated control system and at one or more additional times, additional sensor information identifying current values for one or more attributes of the battery at the one or more additional times, and information about one or more electrical supply amounts to be provided to the battery at the one or more additional times;

generating, by the automated control system and based at least in part on the obtained additional sensor information, one or more additional estimates of the internal state of the battery for the one or more additional times;

determining, by the automated control system, and based at least in part on the model and on the one or more additional estimates of the internal state of the battery and on the one or more electrical supply amounts, one or more additional amounts of electrical power for the battery to receive at the one or more additional times to accept at least some of the one or more electrical supply amounts while maintaining the internal state of the battery in the defined range during the one or more additional times; and

implementing, by the automated control system and for each of the one or more determined additional amounts of electrical power, one or more settings of the DC-to-DC amplifier to cause electrical power being supplied to the battery to satisfy the determined additional amount of electrical power.

13. The computer-implemented method of claim 12 wherein the battery is part of a system having a solar power generator, and wherein the obtained information about the one or more electrical supply amounts to be provided to the battery includes an amount of electrical supply available from the solar power generator.

14. The computer-implemented method of claim 1 wherein the model included in the received information is configured for a battery type of the battery, and wherein the method further comprises adapting the model to information specific to the battery by, during an initial training period before the controlling of the electrical power being output, monitoring changes in values of the one or more attributes of the battery and changes in electrical loads being supplied by the battery, and modifying the model to correspond to the monitored changes in values and the monitored changes in electrical loads.

15. The computer-implemented method of claim 1 wherein the controlling of the electrical power being output further includes adapting the model included in the received information to additional information specific to the battery by monitoring changes over time in values of the one or more attributes of the battery that are based at least in part on increasing impedance of the battery, and modifying the model to correspond to the monitored changes, and wherein at least one amount of electrical power for the battery to supply that is determined and is caused to be output by the implementing is based at least in part on the modified model.

16. The computer-implemented method of claim 1 wherein the automated control system includes a battery controller component connectively coupled to the battery and further includes a control action determination component that is executing remotely from the battery on at least one of the one or more computing systems and that communicates with the battery controller component over one or more computer networks, wherein the control action determination component performs at least the determining of the one or more amounts of electrical power and further performs sending instructions over the one or more computer networks to the battery controller component about the determined one or more amounts of electrical power, and wherein the battery controller component performs at least the implementing of the one or more settings of the DC-to-DC amplifier based at least in part on the sent instructions.

17. The computer-implemented method of claim 16 further comprising controlling, by the battery controller component and at an additional time without receiving any instructions over the one or more computer networks from the control action determination component, the electrical power being output by the battery, including:

obtaining, by the battery controller component and at the additional time, additional sensor information identifying current values for the one or more attributes of the battery at the additional time;

generating, by the battery controller component and based at least in part on the obtained additional sensor information, an additional estimate of the internal state of the battery for the additional time;

determining, by the battery controller component, and based at least in part on the additional estimate of the internal state of the battery, an additional amount of electrical power for the battery to supply at the additional time to maintain the internal state of the battery in the defined range for the additional time; and

implementing, by the battery controller component, one or more settings of the DC-to-DC amplifier to cause electrical power being output to satisfy the determined additional amount of electrical power.
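The local operation recited above, in which the battery controller component acts without instructions from the remote control action determination component, might be sketched as follows. The names, the use of `None` to represent an absent instruction, and the halving back-off are assumptions chosen for illustration, not details from the specification.

```python
from typing import Optional


def choose_power_w(remote_power_w: Optional[float], last_power_w: float,
                   internal_temp_c: float, temp_max_c: float = 45.0,
                   backoff: float = 0.5) -> float:
    """Follow a remote instruction when one arrived; otherwise decide locally."""
    if remote_power_w is not None:
        return remote_power_w
    # No instruction received: hold the previous output unless the local
    # temperature estimate has reached the limit, in which case back off.
    if internal_temp_c >= temp_max_c:
        return last_power_w * backoff
    return last_power_w
```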

18. The computer-implemented method of claim 1 further comprising controlling, by the automated control system and at an additional time without a current electrical load, the electrical power being output by the battery, including:

obtaining, by the automated control system and at the additional time, additional sensor information identifying current values for the one or more attributes of the battery at the additional time;

generating, by the automated control system and based at least in part on the obtained additional sensor information, an additional estimate of the internal state of the battery for the additional time;

determining, by the automated control system, and based at least in part on the model and on the additional estimate of the internal state of the battery, an additional amount of electrical power for the battery to supply at the additional time to maintain the internal state of the battery in the defined range for the additional time; and

implementing, by the automated control system, one or more settings of the DC-to-DC amplifier to cause electrical power being output to satisfy the determined additional amount of electrical power.

19. The computer-implemented method of claim 1 wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times further includes retrieving information about at least one previous amount of electrical power that the battery supplied, and selecting the determined one or more amounts of electrical power to further control a rate of change in amount of electrical power that the battery supplies, as part of maintaining the internal state of the battery in the defined range during the one or more times.
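Controlling a rate of change in the amount of electrical power based on a previous amount can be illustrated, under assumptions, by clamping each new amount to within a fixed step of the previous one. The function name and the `max_step_w` parameter are hypothetical, and a fixed step is only one of many possible ways to limit the rate of change.

```python
def rate_limited_power_w(requested_w, previous_w, max_step_w):
    """Move from the previous amount toward the requested amount,
    changing by at most max_step_w in either direction."""
    delta_w = requested_w - previous_w
    delta_w = max(-max_step_w, min(max_step_w, delta_w))
    return previous_w + delta_w
```

For example, with a previous amount of 100 W and a permitted step of 50 W, a request for 500 W results in 150 W being selected for the current time, with the remainder approached over later times.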

20. The computer-implemented method of claim 1 further comprising identifying at least one previous electrical load, determining one or more amounts of change in electrical load from the at least one previous electrical load to the one or more electrical loads, and further performing the selecting of the determined one or more amounts of electrical power based at least in part on the determined one or more amounts of change in electrical load.

21. The computer-implemented method of claim 1 further comprising identifying at least one previous estimate of the internal state of the battery that is based at least in part on an internal impedance of the battery, determining one or more amounts of change in estimated internal impedance of the battery from the at least one previous estimate to the one or more estimates, and further performing the selecting of the determined one or more amounts of electrical power based at least in part on the determined one or more amounts of change in the estimated internal impedance of the battery.

22. The computer-implemented method of claim 1 further comprising identifying at least one previous estimate of the internal state of the battery that is based at least in part on the internal temperature of the battery, determining one or more amounts of change in estimated internal temperature of the battery from the at least one previous estimate to the one or more estimates, and further performing the selecting of the determined one or more amounts of electrical power based at least in part on the determined one or more amounts of change in the estimated internal temperature of the battery.

23. A non-transitory computer-readable medium having stored contents that cause one or more devices implementing at least part of an automated control system to perform a method, the method comprising:

receiving, by the automated control system, information for use in controlling electrical power from a battery that supplies electrical direct current (DC), wherein the received information includes a model of the battery that specifies restrictions involving manipulating a DC-to-DC amplifier connected to the battery to achieve one or more defined goals including maintaining an internal temperature of the battery in a defined range during the controlling of the electrical power from the battery, wherein the defined range includes a range of internal temperatures of the battery at which the battery operates to reduce premature damage; and

controlling, by the automated control system and based on the received information, the electrical power from the battery, including:

obtaining, by the automated control system, sensor information identifying values for one or more attributes of the battery at multiple times, and information about electrical loads to be satisfied at the multiple times;

generating, by the automated control system and based at least in part on the obtained sensor information, estimates of the internal temperature of the battery for the multiple times;

determining, by the automated control system for one or more of the multiple times, one or more amounts of electrical power for the battery to supply at the one or more times to satisfy at least some of the electrical loads while maintaining the internal temperature of the battery in the defined range, wherein the determining is based at least in part on at least one previous amount of electrical power supplied by the battery in order to control a rate of change in amount of electrical power that the battery supplies, and wherein the determining is further based at least in part on the model and on the estimates of the internal temperature of the battery and on the electrical loads; and

24. The non-transitory computer-readable medium of claim 23 wherein the model of the battery includes one or more rules that limit changing an amount of electrical power being output from the battery to be less than a defined threshold limit for the rate of change, wherein the stored contents include software instructions that, when executed, further cause at least one of the one or more devices to identify the at least one previous amount of electrical power supplied by the battery and to determine a target rate of change that maintains the internal temperature of the battery in the defined range and that is within the defined threshold limit, and wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times is further performed to satisfy the determined target rate of change and based at least in part on the one or more rules.

25. The non-transitory computer-readable medium of claim 23 wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times further includes identifying at least one previous electrical load, determining one or more amounts of change in electrical load from the at least one previous electrical load to the electrical loads to be satisfied at the multiple times, and selecting at least one of the determined one or more amounts of electrical power based at least in part on the determined one or more amounts of change in electrical load.

26. The non-transitory computer-readable medium of claim 23 wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times further includes identifying at least one previous estimate of an internal state of the battery that is based at least in part on an internal impedance of the battery, determining one or more amounts of change in estimated internal impedance of the battery from the at least one previous estimate to subsequent estimates of the internal state of the battery for the multiple times, and selecting at least one of the determined one or more amounts of electrical power based at least in part on the determined one or more amounts of change in the estimated internal impedance of the battery.

27. The non-transitory computer-readable medium of claim 23 wherein the determining of the one or more amounts of electrical power for the battery to supply at the one or more times further includes identifying at least one previous estimate of an internal state of the battery that is based at least in part on the internal temperature of the battery, determining one or more amounts of change in estimated internal temperature of the battery from the at least one previous estimate to the estimates of the internal temperature of the battery for the multiple times, and selecting at least one of the determined one or more amounts of electrical power based at least in part on the determined one or more amounts of change in the estimated internal temperature of the battery.

28. A system comprising:

one or more hardware processors of one or more devices; and

one or more memories storing instructions that, when executed by at least one of the one or more hardware processors, cause the at least one hardware processor to implement an automated control system that controls electrical power supplied from or to a battery, by:

receiving a model of the battery having specified restrictions for manipulating an actuator that controls direct current (DC) characteristics of the electrical power supplied from or to the battery;

obtaining sensor information identifying values for one or more attributes of the battery at an indicated time, and additional information about at least one of an electrical load to be satisfied at the indicated time or an electrical supply amount available to be provided to the battery at the indicated time;

generating, based at least in part on the obtained sensor information, one or more estimates for the indicated time of an internal state of the battery that is not directly observable;

determining, as part of maintaining the internal state of the battery in a defined range for the indicated time, and based at least in part on the model and on the one or more estimates of the internal state of the battery, an amount of electrical power for the battery to supply at the indicated time to satisfy at least some of the electrical load, or an amount of electrical power for the battery to receive at the indicated time by accepting at least some of the electrical supply amount; and

implementing one or more settings of the actuator to cause electrical power being output for the indicated time to satisfy the determined amount of electrical power for the battery to supply, or to cause electrical power being supplied to the battery for the indicated time to satisfy the determined amount of electrical power for the battery to receive.

29. The system of claim 28 wherein the actuator is a DC-to-DC amplifier, and wherein the automated control system controls electrical power supplied from the battery by determining the amount of electrical power for the battery to supply at the indicated time to satisfy at least some of the electrical load and by implementing one or more settings of the DC-to-DC amplifier to cause electrical power being output for the indicated time to satisfy the determined amount of electrical power for the battery to supply.

30. The system of claim 28 wherein the received model of the battery includes multiple rules that specify restrictions involving manipulating the actuator in a manner to achieve one or more defined goals including maintaining the internal state of the battery in the defined range during controlling of the electrical power supplied from or to the battery, wherein the multiple rules include one or more rules to not charge the battery if the battery has a current charge that exceeds a first defined threshold, and one or more rules to not discharge the battery if the battery has a current charge that is below a second defined threshold, and one or more rules to not charge or discharge the battery if doing so would result in an estimated internal temperature of the battery being outside a defined range.
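The three kinds of rules recited above (do not charge above a first threshold, do not discharge below a second threshold, and do not charge or discharge if the estimated internal temperature would leave the defined range) can be sketched as a single check. The threshold values, the string labels for actions, and the representation of charge as a fraction of capacity are all assumptions made for illustration.

```python
def permitted_action(requested, charge_fraction, predicted_temp_c,
                     max_charge_fraction=0.9, min_charge_fraction=0.2,
                     temp_range_c=(0.0, 45.0)):
    """Return the requested action ("charge" or "discharge") if every rule
    allows it, otherwise "idle"."""
    low_c, high_c = temp_range_c
    # Temperature rule: applies to both charging and discharging.
    if not (low_c <= predicted_temp_c <= high_c):
        return "idle"
    # First threshold: do not charge a battery that is already nearly full.
    if requested == "charge" and charge_fraction > max_charge_fraction:
        return "idle"
    # Second threshold: do not discharge a battery that is nearly empty.
    if requested == "discharge" and charge_fraction < min_charge_fraction:
        return "idle"
    return requested
```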