eta_ctrl.agents.mpc_agent module

class eta_ctrl.agents.mpc_agent.MpcAgent(env: VecEnv, sampling_time: float, prediction_horizon: float, model_import: str, verbose: int = 1, *, solver_name: str = 'cplex', action_index: int = 0, model_parameters: dict[str, Any] | None = None, solver_options: dict[str, Any] | None = None, solver_callback: Callable[[BaseAlgorithm], None] | None = None, **kwargs: Any)

Bases: BaseAlgorithm

Simple, Pyomo based optimization agent supporting multiple solvers.

The MpcAgent requires a PyomoModel, which is passed via the model_import parameter. This parameter must be set in the ‘agent_specific’ section of the config. Common Stable-Baselines3 parameters are ignored by the MpcAgent because it cannot be used for learning or training. It can only predict actions via predict(); use EtaCtrl.play() to run experiments.
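To illustrate what a dotted import path refers to, the sketch below resolves one using only the standard library. The helper name resolve_dotted_path is hypothetical and not part of eta_ctrl, and how MpcAgent performs the import internally is not stated here. A standard-library class stands in for a PyomoModel subclass.

```python
import importlib
from typing import Any


def resolve_dotted_path(path: str) -> Any:
    """Import "package.module.Name" and return the object called Name.

    Hypothetical helper for illustration only; not part of eta_ctrl.
    """
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.Name', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# A standard-library class stands in for a PyomoModel subclass here.
model_class = resolve_dotted_path("collections.OrderedDict")
```

In a real configuration the path would point at a user-defined model class such as "my_package.my_module.MyModel".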

Parameters:

policy – Agent policy. Parameter is not used in this agent
env – Environment to be optimized
sampling_time – Interval for one timestep. Used to calculate n_prediction_steps
prediction_horizon – Duration of the prediction in seconds (usually a subsample of the episode duration)
model_import – Dotted import path to the PyomoModel subclass (e.g. "my_package.my_module.MyModel")
verbose – Logging verbosity
solver_name – Name of the solver (e.g. gurobi, cplex, or glpk). It is passed to pyomo.SolverFactory.
action_index – Index of the solution value to be used as action (by default the value for the first timestep in the solution will be used)
model_parameters – Dictionary of parameters to forward to the PyomoModel
solver_options – Dictionary of solver options (e.g. time limits or tolerances)
solver_callback – Optional callback function called after each solve step
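The relationship between sampling_time, prediction_horizon and n_prediction_steps can be sketched as follows. The function below is hypothetical. It assumes that both values are given in seconds and that the horizon is divided with floor division; the exact rounding used by the agent is not documented here.

```python
def n_prediction_steps(prediction_horizon: float, sampling_time: float) -> int:
    """Number of whole timesteps in the horizon (floor division is an assumption)."""
    if sampling_time <= 0:
        raise ValueError("sampling_time must be positive")
    return int(prediction_horizon // sampling_time)


# A one-hour horizon sampled every 15 minutes (both values in seconds).
steps = n_prediction_steps(prediction_horizon=3600, sampling_time=900)
```

With these values, steps is 4.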

actions_order: Specification of the order in which action values should be returned.

solver_name: str: Name of the solver to be used

action_index: Index of the solution value to be used as action (if this is 0, the first value in a list of solution values will be used).

solver_callback: Additional callback for predicting

solver: Pyomo solver instance

model: PyomoModel: PyomoModel instance

get_env() → VecEnv: Helper method for type annotation.

solve() → ConcreteModel

Solve the current pyomo model instance with given parameters. This could also be used separately to solve normal MILP problems. Since the entire problem instance is returned, result handling can be outsourced.

Returns: Solved pyomo model instance.

predict(observation: np.ndarray | dict[str, np.ndarray], state: tuple[np.ndarray, ...] | None = None, episode_start: np.ndarray | None = None, deterministic: bool = False) → tuple[np.ndarray, tuple[np.ndarray, ...] | None]

Solve the current pyomo model instance with given parameters and observations and return the optimal actions.

Parameters:

observation – The input observation (not used here).
state – The last states (not used here).
episode_start – The last masks (not used here).
deterministic – Whether to return deterministic actions. This agent always returns deterministic actions.

Returns:

Tuple of the model’s action and the next state (not used here).
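As an illustration of how action_index and actions_order might interact, the following sketch picks one value per action from a solved horizon. The dictionary layout, the variable names and the helper select_actions are all hypothetical; the real agent reads its values from the solved Pyomo model and returns a NumPy array rather than a list.

```python
from typing import Mapping, Sequence


def select_actions(
    solution: Mapping[str, Sequence[float]],
    actions_order: Sequence[str],
    action_index: int = 0,
) -> list[float]:
    """Return one value per action, in actions_order, taken at action_index."""
    return [solution[name][action_index] for name in actions_order]


# Hypothetical solved trajectories over a three-step horizon.
solution = {"heater_power": [0.8, 0.6, 0.4], "valve_position": [0.1, 0.2, 0.3]}
actions = select_actions(solution, ["valve_position", "heater_power"])
```

Here actions holds the first-timestep values in the requested order, [0.1, 0.8]. The rest of the horizon is discarded, which is the usual receding-horizon pattern in model predictive control.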

learn(total_timesteps: int, callback: MaybeCallback = None, log_interval: int = 100, tb_log_name: str = 'run', reset_num_timesteps: bool = True, progress_bar: bool = False) → Self

The MPC approach cannot learn a new model. Instead, set the model attribute to a Pyomo ConcreteModel to use the prediction function of this agent.

Parameters:

total_timesteps – The total number of samples (env steps) to train on
callback – Callback(s) called at every step with the state of the algorithm.
log_interval – The number of timesteps before logging.
tb_log_name – The name of the run for TensorBoard logging
reset_num_timesteps – Whether or not to reset the current timestep number (used in logging)
progress_bar – Display a progress bar using tqdm and rich.

Returns:

The trained model.

handle_solve_failed(result: Any) → None

Called when the solver did not reach an optimal solution.

If a feasible (suboptimal) solution exists, logs a warning and returns so the caller can continue with that solution. If no feasible solution exists, logs full diagnostics and raises an InfeasibleConstraintException.

Parameters: result – Result object returned by the Pyomo solver.
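The control flow described for handle_solve_failed can be sketched as below. This is a minimal illustration and not the library's implementation: FakeResult is a hypothetical stand-in for the Pyomo result object, the has_feasible_solution flag is an assumed simplification of the solver status checks, and InfeasibleConstraintException is redefined locally so that the example is self-contained.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger(__name__)


class InfeasibleConstraintException(Exception):
    """Local stand-in for the exception named in the description above."""


@dataclass
class FakeResult:
    """Hypothetical stand-in for a solver result object."""

    termination: str
    has_feasible_solution: bool


def handle_solve_failed(result: FakeResult) -> None:
    """Warn and return if a feasible solution exists; otherwise log and raise."""
    if result.has_feasible_solution:
        log.warning("Solver stopped with %s; using suboptimal solution.", result.termination)
        return
    log.error("No feasible solution; termination condition: %s", result.termination)
    raise InfeasibleConstraintException(result.termination)


# Hitting a time limit while holding a feasible solution only logs a warning.
handle_solve_failed(FakeResult("maxTimeLimit", has_feasible_solution=True))
```

Returning normally in the suboptimal case lets the caller keep using the solution it already has, while the exception signals that no usable action can be derived.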