eta_ctrl.agents.mpc_agent module

class eta_ctrl.agents.mpc_agent.MpcAgent(env: VecEnv, sampling_time: float, prediction_horizon: float, model_import: str, verbose: int = 1, *, solver_name: str = 'cplex', action_index: int = 0, model_parameters: dict[str, Any] | None = None, solver_options: dict[str, Any] | None = None, solver_callback: Callable[[BaseAlgorithm], None] | None = None, **kwargs: Any)

Bases: BaseAlgorithm

Simple, Pyomo-based optimization agent supporting multiple solvers.

The MpcAgent requires a PyomoModel, which is passed via the model_import parameter. This parameter must be defined in the config under the ‘agent_specific’ section. Common Stable-Baselines3 parameters are ignored by the MpcAgent because it cannot be used for learning or training. It can only predict actions via predict(); use EtaCtrl.play() to run experiments.

Parameters:
  • policy – Agent policy. This parameter is not used by this agent.

  • env – Environment to be optimized

  • sampling_time – Interval for one timestep. Used to calculate n_prediction_steps

  • prediction_horizon – Duration of the prediction in seconds (usually a fraction of the episode duration)

  • model_import – Dotted import path to the PyomoModel subclass (e.g. "my_package.my_module.MyModel")

  • verbose – Logging verbosity

  • solver_name – Name of the solver (e.g. gurobi, cplex, or glpk). Passed to Pyomo's SolverFactory.

  • action_index – Index of the solution value to be used as action (by default the value for the first timestep in the solution will be used)

  • model_parameters – Dictionary of parameters to forward to the PyomoModel

  • solver_options – Dictionary of solver options (e.g. time limits or tolerances)

  • solver_callback – Optional callback function called after each solve step
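How sampling_time, prediction_horizon, and model_import might be interpreted can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the helper names n_prediction_steps and resolve_dotted_path are hypothetical, and deriving the step count by floor division of the horizon by the sampling time is an assumption.

```python
import importlib
from typing import Any


def n_prediction_steps(prediction_horizon: float, sampling_time: float) -> int:
    """Number of timesteps covered by the horizon (assumption: floor division)."""
    if sampling_time <= 0:
        raise ValueError("sampling_time must be positive")
    return int(prediction_horizon // sampling_time)


def resolve_dotted_path(path: str) -> Any:
    """Import the object named by a dotted path such as 'package.module.ClassName'."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.Attribute', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# One hour of prediction at a 15-minute sampling time.
steps = n_prediction_steps(prediction_horizon=3600, sampling_time=900)

# A standard-library class stands in for a PyomoModel subclass here.
ordered_dict_cls = resolve_dotted_path("collections.OrderedDict")
```

In a real configuration the dotted path would point at a PyomoModel subclass, as in the "my_package.my_module.MyModel" example above.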

actions_order

Specification of the order in which action values should be returned.

solver_name: str

Name of the solver to be used

action_index

Index of the solution value to be used as action (if this is 0, the first value in a list of solution values will be used).
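The effect of action_index can be illustrated with a plain list. The values and the select_action helper are hypothetical and only show the indexing described above, not how the agent extracts values from the solved model.

```python
# Hypothetical solution values for one action over a four-step horizon.
solution_values = [21.5, 22.0, 22.4, 22.1]


def select_action(values: list[float], action_index: int = 0) -> float:
    """Return the solution value at action_index (0 selects the first timestep)."""
    return values[action_index]


first_step_action = select_action(solution_values)
second_step_action = select_action(solution_values, action_index=1)
```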

solver_callback

Optional callback function called after each solve step
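A callback matching the Callable[[BaseAlgorithm], None] signature could look like the sketch below. The function log_solver_name and the stand-in object are hypothetical; the sketch only assumes that the agent is passed as the single argument and exposes the solver_name attribute documented above.

```python
from types import SimpleNamespace

solve_log: list[str] = []


def log_solver_name(agent) -> None:
    """Hypothetical callback: record which solver the agent is configured with."""
    solve_log.append(agent.solver_name)


# Stand-in for an MpcAgent instance; in practice the agent itself would be passed in.
stand_in_agent = SimpleNamespace(solver_name="glpk")
log_solver_name(stand_in_agent)
```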

solver

Pyomo solver instance

model: PyomoModel

PyomoModel instance

get_env() → VecEnv

Helper method for type annotation.

solve() → ConcreteModel

Solve the current Pyomo model instance with the given parameters. This method can also be used on its own to solve ordinary MILP problems. Because the entire problem instance is returned, result handling can be left to the caller.

Returns:

Solved pyomo model instance.

predict(observation: np.ndarray | dict[str, np.ndarray], state: tuple[np.ndarray, ...] | None = None, episode_start: np.ndarray | None = None, deterministic: bool = False) → tuple[np.ndarray, tuple[np.ndarray, ...] | None]

Solve the current pyomo model instance with given parameters and observations and return the optimal actions.

Parameters:
  • observation – The input observation (not used here).

  • state – The last states (not used here).

  • episode_start – The last masks (not used here).

  • deterministic – Whether to return deterministic actions. This agent always returns deterministic actions.

Returns:

Tuple of the model’s action and the next state (not used here).

learn(total_timesteps: int, callback: MaybeCallback = None, log_interval: int = 100, tb_log_name: str = 'run', reset_num_timesteps: bool = True, progress_bar: bool = False) → Self

The MPC approach cannot learn a new model. Instead, set the model attribute to a Pyomo ConcreteModel in order to use the prediction function of this agent.

Parameters:
  • total_timesteps – The total number of samples (env steps) to train on

  • callback – Callback(s) called at every step with the state of the algorithm.

  • log_interval – The number of timesteps before logging.

  • tb_log_name – The name of the run for TensorBoard logging

  • reset_num_timesteps – Whether or not to reset the current timestep number (used in logging)

  • progress_bar – Display a progress bar using tqdm and rich.

Returns:

The trained model.

handle_solve_failed(result: Any) → None

Called when the solver did not reach an optimal solution.

If a feasible (suboptimal) solution exists, logs a warning and returns so the caller can continue with that solution. If no feasible solution exists, logs full diagnostics and raises an InfeasibleConstraintException.

Parameters:

result – Result object returned by the Pyomo solver.
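The decision described for handle_solve_failed can be sketched in plain Python. Everything here is an assumption made for illustration: SolveResult is a simplified stand-in for the Pyomo result object, InfeasibleProblemError stands in for the library's InfeasibleConstraintException, and the real method may inspect the solver status differently.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


class InfeasibleProblemError(RuntimeError):
    """Hypothetical stand-in for InfeasibleConstraintException."""


@dataclass
class SolveResult:
    """Hypothetical, simplified result; not the Pyomo results object."""

    optimal: bool
    feasible: bool
    message: str = ""


def handle_solve_failed(result: SolveResult) -> None:
    """Warn on a feasible but suboptimal result; raise if nothing feasible exists."""
    if result.feasible:
        logger.warning("Solver stopped before optimality: %s", result.message)
        return  # The caller continues with the suboptimal solution.
    logger.error("No feasible solution found: %s", result.message)
    raise InfeasibleProblemError(result.message)


# A time-limited solve that still produced a usable solution only logs a warning.
handle_solve_failed(SolveResult(optimal=False, feasible=True, message="time limit"))
```

Passing a result with feasible=False would raise InfeasibleProblemError instead of returning.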