{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customising the environment class - Actions and Observations\n",
    "\n",
    "For more complex systems (and therefore simulink models) the definition of other action and observation spaces are very useful tools. Also, there can be inputs which are external and not controlled by the agent. In the following example this will be shown for an example from the area of energy systems.\n",
    "\n",
    "As an example the energy managenment of a DC microgrid is chosen. The microgrid consist of a PV array, a battery storage, a load and a grid connection. The corresponding simulink file (MicrogridExample.slx) and the necessary simscape-files (shortCircuit.ssc, VariablePowerLoad.ssc and Wire_2Cond.ssc) are included. The main idea is to optimise the use of the battery storage to minimise the energy consuption from the main grid and maximise the use of the PV energy.\n",
    "\n",
    "As first step, the config-file is read and a FMU is generated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning: The data dictionary 'BusSystem.sldd' was not found.\n",
      "Warning: The data dictionary 'BusSystem.sldd' was not found.\n",
      "Setting System Target to FMU Co-Simulation for model 'MicrogridExample'.\n",
      "Setting Hardware Implementation > Device Type to 'MATLAB Host' for model 'MicrogridExample'.\n",
      "### 'GenerateComments' is disabled for Co-Simulation Standalone FMU Export.\n",
      "### Generating code for Physical Networks associated with solver block 'MicrogridExample/GridFMU/Solver/Solver Configuration' ...\n",
      "done.\n",
      "Warning: The following parameters are not exported due to Code Generation limitations: 'Ts'. Update the storage class of the parameter to get the expected results.\n",
      "\n",
      "Build Summary\n",
      "\n",
      "Top model targets built:\n",
      "\n",
      "Model             Action                        Rebuild Reason                                    \n",
      "==================================================================================================\n",
      "MicrogridExample  Code generated and compiled.  Code generation information file does not exist.  \n",
      "\n",
      "1 of 1 models built (0 models already up to date)\n",
      "Build duration: 0h 0m 11.668s\n",
      "### Model was successfully exported to co-simulation standalone FMU: '/home/cao2851/git/stablerls/examples/MicrogridExample.fmu'.\n"
     ]
    }
   ],
   "source": [
    "# import packages as in the other examples\n",
    "import stablerls.gymFMU as gymFMU\n",
    "import stablerls.configreader as cfg_reader\n",
    "import stablerls.createFMU as createFMU\n",
    "import numpy as np\n",
    "import logging\n",
    "\n",
    "# normally we dont recommend the info-logging but here it is used for demonstration\n",
    "logging.basicConfig(level=logging.INFO)\n",
    "\n",
    "# read config-file\n",
    "config = cfg_reader.configreader('03-config.cfg')\n",
    "\n",
    "# create FMU\n",
    "createFMU.createFMU(config,'MicrogridExample.slx')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After the generation of the FMU, an instance of the corresponding gymnasium environment is created. Because information logging is active, all in- and outputs are listed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:stablerls.fmutools:Using: 03-MicrogridFMU.fmu\n",
      "INFO:stablerls.fmutools:Unzipped in /tmp/tmpy4lw3xsk\n",
      "INFO:stablerls.fmutools:Found inputs - access them by the corresponding number:\n",
      "INFO:stablerls.fmutools: 0: Control.InptPV.ModuleTemperature.ModuleTemperature\n",
      "INFO:stablerls.fmutools: 1: Control.InptPV.Irradiance.Irradiance\n",
      "INFO:stablerls.fmutools: 2: Control.InptLoad.LoadPower\n",
      "INFO:stablerls.fmutools: 3: Control.InptBattery.VoltageReference\n",
      "INFO:stablerls.fmutools: 4: Control.InptBattery.SOC_Init\n",
      "INFO:stablerls.fmutools: 5: Control.InptGrid.VoltageReference\n",
      "INFO:stablerls.fmutools:Found outputs - access them by the corresponding number:\n",
      "INFO:stablerls.fmutools: 0: Measurement.PV.V\n",
      "INFO:stablerls.fmutools: 1: Measurement.PV.P\n",
      "INFO:stablerls.fmutools: 2: Measurement.PV.I\n",
      "INFO:stablerls.fmutools: 3: Measurement.Grid.V\n",
      "INFO:stablerls.fmutools: 4: Measurement.Grid.P\n",
      "INFO:stablerls.fmutools: 5: Measurement.Grid.I\n",
      "INFO:stablerls.fmutools: 6: Measurement.Load.V\n",
      "INFO:stablerls.fmutools: 7: Measurement.Load.I\n",
      "INFO:stablerls.fmutools: 8: Measurement.Battery.Pbat\n",
      "INFO:stablerls.fmutools: 9: Measurement.Battery.Vbat\n",
      "INFO:stablerls.fmutools: 10: Measurement.Battery.Ibat\n",
      "INFO:stablerls.fmutools: 11: Measurement.Battery.SOC\n",
      "INFO:stablerls.fmutools: 12: Measurement.Battery.Inet\n",
      "INFO:stablerls.fmutools: 13: Measurement.Battery.Vnet\n"
     ]
    }
   ],
   "source": [
    "# create instance of the model\n",
    "env = gymFMU.StableRLS(config)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A look at the input list shows that only some of them shall be controlled by the agent while others depend on environmental parameters like the actual weather or are fixed system parameters. In this example, only the voltage references for grid and battery converters are to be controlled. For irradiance and module temperature for the PV module model real data from an external source is imported. The values for the initial SOC at the start of a training episode are randomised to cover many possible scenarios. For the reference voltages, only discrete steps shall be possible, so two discrete actions with 11 possible states each are defined in the action space. During the action assignment, they are converted into reference steps between -0.4 and 0.4 V.\n",
    "\n",
    "Additionally, 13 outputs are a lot and not necessary for the agent since some of them contain similiar information. Knowledge about the voltages and source currents as well as the battery SOC is enough. This results in 8 observations which are continous, but can only take limited values due to a normalisation. They will be normalised to values between +-1 in the observation processing. For the normalisation the nominal voltages and currents are added to the config file:\n",
    "\n",
    "```\n",
    "[General]\n",
    "nominal_voltage = 48.0\n",
    "nominal_current = 25.0\n",
    "```"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Therefore, the spaces needs to modified. For this, a new class is defined which is inherited from the StableRLS base class and the corresponding setter methods are overriden. In addition to the setter methods, the methods for assignment of the actions and the observation processing are overridden to be coherent with the new spaces. For the external, weather dependent values the pre-defined function *FMU_external_input* is used. Here only fixed values are taken, but in the next example files with the weather data are read.\n",
    "\n",
    "In addition, the *reset_* function is changed to choose a random SOC for the start of the simulation and set the action signals for the first run. This function could further be expanded by reading real weather data, e. g. for different times."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import stablerls.gymFMU as gymFMU\n",
    "import stablerls.configreader as cfg_reader\n",
    "import stablerls.createFMU as createFMU\n",
    "import gymnasium as gym\n",
    "import numpy as np\n",
    "import logging\n",
    "import random\n",
    "\n",
    "logger = logging.getLogger(__name__)\n",
    "\n",
    "class GridEnv(gymFMU.StableRLS):\n",
    "    def set_action_space(self):\n",
    "        \"\"\"Setter function for the action space of the agent. \n",
    "        This function overrides the base implementation of StableRLS to \n",
    "        choose only certain inputs as actions.\n",
    "\n",
    "        Returns\n",
    "        -------\n",
    "        space : gymnasium.space\n",
    "            Returns the action space defined by specified FMU inputs\n",
    "        \"\"\"\n",
    "        return gym.spaces.MultiDiscrete([11, 11])\n",
    "    \n",
    "    def set_observation_space(self):\n",
    "        \"\"\"Setter function for the observation space of the agent. \n",
    "        This function overrides the base implementation of StableRLS to \n",
    "        choose only certain outputs as observations.\n",
    "\n",
    "        Returns\n",
    "        -------\n",
    "        space : gymnasium.space\n",
    "            Returns the observation space defined by specified FMU outputs\n",
    "        \"\"\"\n",
    "        high = np.arange(8).astype(np.float32)\n",
    "        high[:] = 1\n",
    "        low = high * -1\n",
    "        return gym.spaces.Box(low, high)\n",
    "    \n",
    "    def assignAction(self, action):\n",
    "        \"\"\"Changed assignment of actions to the FMU because only certain inputs\n",
    "        are used for the agent actions.\n",
    "\n",
    "        Parameters\n",
    "        ----------\n",
    "        action : list\n",
    "            An action provided by the agent to update the environment state.\n",
    "        \"\"\"\n",
    "        # assign actions to inputs\n",
    "        # check if actions are within action space\n",
    "        if not self.action_space.contains(action):\n",
    "            logger.info(f\"The actions are not within the action space. Action: {action}. Time: {self.time}\")\n",
    "\n",
    "        # convert discrete actions into steps of voltage references\n",
    "        vStepGrid = (action[0] - 5) * 0.04    \n",
    "        vStepBat = (action[1] - 5) * 0.04\n",
    "\n",
    "        # add them to actual references to get new setpoints\n",
    "        vRefGrid = self.fmu.fmu.getReal([self.fmu.input[3].valueReference])[0] + vStepGrid  \n",
    "        vRefBat = self.fmu.fmu.getReal([self.fmu.input[5].valueReference])[0] + vStepBat\n",
    "\n",
    "        # assign actions to the FMU inputs - take care of right indices!\n",
    "        self.fmu.fmu.setReal([self.fmu.input[3].valueReference], [vRefGrid])\n",
    "        self.fmu.fmu.setReal([self.fmu.input[5].valueReference], [vRefBat])\n",
    "\n",
    "    def obs_processing(self, raw_obs):\n",
    "        \"\"\"Customised action processing: Only specific outputs are evaluated. Additionally,\n",
    "        they are normalised to +-1.\n",
    "\n",
    "        Parameters\n",
    "        ----------\n",
    "        raw_obs : ObsType\n",
    "            The raw observation defined by all FMU outputs.\n",
    "        Returns\n",
    "        -------\n",
    "        observation : ObsType\n",
    "            The processed observation for the agent.\n",
    "        \"\"\"\n",
    "\n",
    "        nDec = 2\n",
    "        observation = np.array([  round((raw_obs[0] - self.nominal_voltage) / (0.1*self.nominal_voltage), 2),   # PV.V\n",
    "                                  round((raw_obs[3] - self.nominal_voltage) / (0.1*self.nominal_voltage), 2),   # Grid.V                     \n",
    "                                  round((raw_obs[6] - self.nominal_voltage) / (0.1*self.nominal_voltage), 2),   # Load.V\n",
    "                                  round((raw_obs[13] - self.nominal_voltage) / (0.1*self.nominal_voltage), 2),  # Bat.V  \n",
    "                                  round(raw_obs[2] / self.nominal_current, 2),                                  # PV.I\n",
    "                                  round(raw_obs[5] / self.nominal_current, 2),                                  # Grid.I\n",
    "                                  round(raw_obs[12] / self.nominal_current, 2),                                 # Bat.Inet\n",
    "                                  round(raw_obs[11], nDec),                                                     # Bat.SOC\n",
    "                               ]).astype(np.float32)\n",
    "\n",
    "        return observation\n",
    "    \n",
    "    def reset_(self, seed=None):\n",
    "        \"\"\"Since zeros make no sense for the voltage references, the input reset is\n",
    "        changed. Also, the initial SOC is chosen randomly between limits.\n",
    "\n",
    "        Parameters\n",
    "        ----------\n",
    "        seed : int, optional\n",
    "            - None -\n",
    "        \"\"\"\n",
    "        # set voltage references to nominal voltage\n",
    "        self.fmu.fmu.setReal([self.fmu.input[3].valueReference], [self.nominal_voltage])\n",
    "        self.fmu.fmu.setReal([self.fmu.input[5].valueReference], [self.nominal_voltage])\n",
    "\n",
    "        # set initial SOC, randomly chosen\n",
    "        rand = random.random()\n",
    "        if rand < 0.15:\n",
    "            self.soc_init = 0.15\n",
    "        elif rand > 0.85:\n",
    "            self.soc_init = 0.85\n",
    "        else:\n",
    "            self.soc_init = rand \n",
    "        self.fmu.fmu.setReal([self.fmu.input[4].valueReference], [self.soc_init])\n",
    "\n",
    "        # all other inputs are set during the calculation of the first step since \n",
    "        # they are external\n",
    "\n",
    "        # get the first observation as specified by gymnaisum\n",
    "        self._next_observation(steps=1)\n",
    "        return self.obs_processing(self.outputs[self.step_count, :])\n",
    "\n",
    "    def FMU_external_input(self):\n",
    "        \"\"\"This function is called before each FMU step. Here external FMU\n",
    "        inputs independent of the agent action are set. In this case, this includes\n",
    "        weather data and the load power.\n",
    "\n",
    "        Use the code below to access the FMU inputs.\n",
    "        self.fmu.fmu.setReal([self.fmu.input[0].valueReference], [value])\n",
    "        \"\"\"\n",
    "        # Irradiance\n",
    "        self.fmu.fmu.setReal([self.fmu.input[1].valueReference], [1000.0])\n",
    "\n",
    "        # ModuleTemperature\n",
    "        self.fmu.fmu.setReal([self.fmu.input[0].valueReference], [30.0])\n",
    "\n",
    "        # LoadPower\n",
    "        self.fmu.fmu.setReal([self.fmu.input[2].valueReference], [500.0])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With this newly defined class, a new environment object is created and simulated for 10 steps. As you can see, the observations now consist of the normalised eight values which correspond to the voltages and source currents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:stablerls.fmutools:Using: 03-MicrogridFMU.fmu\n",
      "INFO:stablerls.fmutools:Unzipped in /tmp/tmp_86hm__d\n",
      "INFO:stablerls.fmutools:Found inputs - access them by the corresponding number:\n",
      "INFO:stablerls.fmutools: 0: Control.InptPV.ModuleTemperature.ModuleTemperature\n",
      "INFO:stablerls.fmutools: 1: Control.InptPV.Irradiance.Irradiance\n",
      "INFO:stablerls.fmutools: 2: Control.InptLoad.LoadPower\n",
      "INFO:stablerls.fmutools: 3: Control.InptBattery.VoltageReference\n",
      "INFO:stablerls.fmutools: 4: Control.InptBattery.SOC_Init\n",
      "INFO:stablerls.fmutools: 5: Control.InptGrid.VoltageReference\n",
      "INFO:stablerls.fmutools:Found outputs - access them by the corresponding number:\n",
      "INFO:stablerls.fmutools: 0: Measurement.PV.V\n",
      "INFO:stablerls.fmutools: 1: Measurement.PV.P\n",
      "INFO:stablerls.fmutools: 2: Measurement.PV.I\n",
      "INFO:stablerls.fmutools: 3: Measurement.Grid.V\n",
      "INFO:stablerls.fmutools: 4: Measurement.Grid.P\n",
      "INFO:stablerls.fmutools: 5: Measurement.Grid.I\n",
      "INFO:stablerls.fmutools: 6: Measurement.Load.V\n",
      "INFO:stablerls.fmutools: 7: Measurement.Load.I\n",
      "INFO:stablerls.fmutools: 8: Measurement.Battery.Pbat\n",
      "INFO:stablerls.fmutools: 9: Measurement.Battery.Vbat\n",
      "INFO:stablerls.fmutools: 10: Measurement.Battery.Ibat\n",
      "INFO:stablerls.fmutools: 11: Measurement.Battery.SOC\n",
      "INFO:stablerls.fmutools: 12: Measurement.Battery.Inet\n",
      "INFO:stablerls.fmutools: 13: Measurement.Battery.Vnet\n",
      "INFO:stablerls.gymFMU:Simulation done\n",
      "INFO:stablerls.fmutools:Close fmu and try deleting unzipdir\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Action: [5 5]\n",
      "Observation: [-0.12 -0.1  -0.17 -0.1   0.    0.11  0.17  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.13 -0.1  -0.19 -0.1   0.01  0.15  0.22  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.13 -0.1  -0.19 -0.1   0.01  0.16  0.24  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.13 -0.1  -0.19 -0.1   0.02  0.16  0.24  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.13 -0.1  -0.19 -0.1   0.02  0.16  0.24  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.13 -0.1  -0.19 -0.1   0.03  0.16  0.24  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.13 -0.1  -0.19 -0.1   0.03  0.16  0.23  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.12 -0.1  -0.19 -0.1   0.04  0.16  0.23  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.12 -0.1  -0.19 -0.1   0.04  0.16  0.23  0.15]\n",
      "\n",
      "Action: [5 5]\n",
      "Observation: [-0.12 -0.1  -0.19 -0.1   0.05  0.16  0.22  0.15]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# change logging level to warnings only\n",
    "logging.basicConfig(level=logging.WARNING)\n",
    "\n",
    "# read config-file, this stay the same for the changed environment\n",
    "config = cfg_reader.configreader('03-config.cfg')\n",
    "\n",
    "# create new env object and reset it before simulating\n",
    "microgrid = GridEnv(config)\n",
    "obs = microgrid.reset()\n",
    "\n",
    "# for this example, the actions are kept constant at the reference value of 48 V\n",
    "action = np.array([5,5])\n",
    "\n",
    "terminated = False\n",
    "truncated = False\n",
    "while not (terminated or truncated):\n",
    "    observation, reward, terminated, truncated, info  = microgrid.step(action)\n",
    "    print(f'Action: {action}\\nObservation: {observation}\\n')\n",
    "\n",
    "microgrid.close()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}