Control Algorithms

The agents implemented in eta_ctrl.agents are subclasses of stable_baselines3.common.base_class.BaseAlgorithm. Calling them agents is a remnant from stable_baselines2 (the wording was changed in stable_baselines3).

Usually there is no need to dive deeply into the agents provided by ETA Ctrl. You can use them by specifying their import path in your experiment configuration, without worrying about how they work. Note, however, that some agents do not implement every method that the interface would require in normal usage. Within the ETA Ctrl framework this usually isn’t a problem, because those methods are not used.

The currently available agents are listed here. Note that you need to specify the parameters required for instantiation in the agent_specific section of the ETA Ctrl configuration file.

MPC Agent

The MpcAgent agent implements a model predictive controller. It can be used to solve mathematical models in conjunction with mathematical solvers such as gurobi, cplex or glpk and it relies on the pyomo library to achieve this.

You can provide additional arguments to the agent in kwargs. These are interpreted first as arguments for the base class and then as arguments for the solver: any argument passed to MpcAgent that BaseAlgorithm does not recognize is passed on to the solver. This allows free configuration of all solver options.
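The two-stage interpretation of kwargs described above can be pictured with a small sketch. This is not the MpcAgent implementation: BaseStandIn is a hypothetical stand-in for BaseAlgorithm, split_kwargs is a helper invented for illustration, and the option names timelimit and mipgap are only example keys.

```python
from __future__ import annotations

import inspect
from typing import Any


class BaseStandIn:
    """Hypothetical stand-in for BaseAlgorithm; only its signature matters here."""

    def __init__(self, env: Any, verbose: int = 1, seed: int | None = None) -> None:
        self.env = env
        self.verbose = verbose
        self.seed = seed


def split_kwargs(base_cls: type, kwargs: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split kwargs into those the base class accepts and the remainder."""
    accepted = set(inspect.signature(base_cls.__init__).parameters) - {"self"}
    base_kwargs = {k: v for k, v in kwargs.items() if k in accepted}
    # Everything the base class does not recognize is treated as a solver option.
    solver_kwargs = {k: v for k, v in kwargs.items() if k not in accepted}
    return base_kwargs, solver_kwargs


# "seed" is known to the stand-in base class; the other two keys are not.
base_kwargs, solver_kwargs = split_kwargs(
    BaseStandIn, {"seed": 42, "timelimit": 60, "mipgap": 0.01}
)
print(base_kwargs)    # arguments kept for the base class
print(solver_kwargs)  # arguments forwarded to the solver
```

One consequence of this scheme is that a misspelled base class argument would presumably end up among the solver options rather than raise an error, so it is worth checking the spelling of keys carefully.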

class eta_ctrl.agents.MpcAgent(env: VecEnv, sampling_time: float, prediction_horizon: float, model_import: str, verbose: int = 1, *, solver_name: str = 'cplex', action_index: int = 0, model_parameters: dict[str, Any] | None = None, solver_options: dict[str, Any] | None = None, solver_callback: Callable[[BaseAlgorithm], None] | None = None, **kwargs: Any)

Simple, Pyomo based optimization agent supporting multiple solvers.

The MpcAgent requires a PyomoModel, which is passed via the model_import parameter and must be defined in the config under the ‘agent_specific’ section. Common stable_baselines3 parameters are ignored by the MpcAgent because it cannot be used for learning or training. It can only predict actions via predict(); use EtaCtrl.play() to run experiments.

Parameters:

policy – Agent policy. Parameter is not used in this agent
env – Environment to be optimized
sampling_time – Interval for one timestep. Used to calculate n_prediction_steps
prediction_horizon – Duration of the prediction in seconds (usually a subsample of the episode duration)
model_import – Dotted import path to the PyomoModel subclass (e.g. "my_package.my_module.MyModel")
verbose – Logging verbosity
solver_name – Name of the solver (e.g. gurobi, cplex, or glpk). Is passed to pyomo.SolverFactory.
action_index – Index of the solution value to be used as action (by default the value for the first timestep in the solution will be used)
model_parameters – Dictionary of parameters to forward to the PyomoModel
solver_options – Dictionary of solver options (e.g. time limits or tolerances)
solver_callback – Optional callback function called after each solve step
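As a rough illustration of how these parameters relate to each other, the sketch below writes a hypothetical set of agent_specific values as a Python dict and derives n_prediction_steps from them. Several points are assumptions rather than documented behavior: that sampling_time is given in seconds like prediction_horizon, that the step count is the floor of their ratio, and that the solution can be viewed as one value per prediction step. The concrete values and the tmlim option for glpk are examples only.

```python
from __future__ import annotations

from typing import Any

# Hypothetical values for the agent_specific section, written as a Python dict.
agent_specific: dict[str, Any] = {
    "sampling_time": 900.0,  # assumed to be in seconds (15 minutes)
    "prediction_horizon": 21600.0,  # seconds (6 hours)
    "model_import": "my_package.my_module.MyModel",
    "solver_name": "glpk",
    "action_index": 0,
    "solver_options": {"tmlim": 60},  # example solver option, passed through as-is
}

# Assumption: the number of prediction steps is the floored ratio of the two durations.
n_prediction_steps = int(
    agent_specific["prediction_horizon"] // agent_specific["sampling_time"]
)

# Stand-in for a solved decision variable with one value per prediction step.
solution = [0.1 * step for step in range(n_prediction_steps)]

# action_index selects which timestep of the solution is returned as the action.
action = solution[agent_specific["action_index"]]
print(n_prediction_steps, action)
```

With action_index left at its default of 0, only the value for the first timestep is applied, and the remaining steps of the horizon serve as look-ahead for the optimization.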

Rule Based Agent (Base Class)

The rule based agent is a base class which facilitates the creation of simple rule based agents. To use it, you need to implement the eta_ctrl.agents.RuleBased.control_rules() method. The control_rules method takes the array of observations from the environment and determines an array of actions based on them.

class eta_ctrl.agents.RuleBased(policy: type[BasePolicy], env: VecEnv, verbose: int = 4, _init_setup_model: bool = True, **kwargs: Any)

The rule based agent base class provides the facilities to easily build a complete rule based agent. To achieve this, only the control_rules function must be implemented. It should take an observation from the environment as input and provide actions as an output.

Parameters:

policy – Agent policy. Parameter is not used in this agent and can be set to NoPolicy.
env – Environment to be controlled.
verbose – Logging verbosity.
kwargs – Additional arguments as specified in stable_baselines3.common.base_class.

abstractmethod control_rules(observation: ndarray | dict[str, ndarray]) → ndarray

This method is abstract and must be overridden to implement the control rules which determine actions from the received observations.

Parameters: observation – Observations as provided by a single, non-vectorized environment.
Returns: Action values, as determined by the control rules.
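A minimal sketch of what a control_rules implementation might look like follows. To keep it runnable without ETA Ctrl installed, it uses RuleBasedStandIn, a hypothetical abstract class that only mirrors the documented method name. It also uses plain Python sequences where the real interface uses ndarray. The ThermostatRules class, its setpoint and band values, and the meaning of the observation are all invented for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections.abc import Sequence


class RuleBasedStandIn(ABC):
    """Stand-in mirroring the documented interface; not the eta_ctrl class."""

    @abstractmethod
    def control_rules(self, observation: Sequence[float]) -> list[float]:
        """Map one observation to one action vector."""


class ThermostatRules(RuleBasedStandIn):
    """Two-point controller with hysteresis around a temperature setpoint."""

    def __init__(self, setpoint: float = 21.0, band: float = 0.5) -> None:
        self.setpoint = setpoint
        self.band = band
        self.heater_on = False

    def control_rules(self, observation: Sequence[float]) -> list[float]:
        temperature = observation[0]  # assumed: first entry is the room temperature
        if temperature < self.setpoint - self.band:
            self.heater_on = True
        elif temperature > self.setpoint + self.band:
            self.heater_on = False
        # Inside the band the previous state is kept.
        return [1.0 if self.heater_on else 0.0]


agent = ThermostatRules()
actions = [agent.control_rules([t])[0] for t in (19.0, 21.2, 22.0, 21.0)]
print(actions)
```

In an actual experiment the class would presumably subclass eta_ctrl.agents.RuleBased instead of the stand-in and return an ndarray, with everything else (prediction, environment handling) supplied by the base class.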