Welcome to StableRLS

This first tutorial shows how to:

1. import the package
2. create an FMU from your Simulink model
3. create and simulate the environment

The basic config file has three sections: General, FMU and Reinforcement Learning. By default the parameters of all sections are available within the environment class. The documentation provides more information about all available config options. To get started we need to set:

- FMU_path: location where the FMU is stored. If you compile the FMU yourself, this path is also used as the target location.
- stop_time: simulation time at which the simulation/episode is terminated.
- dt: time step of the simulation, which runs with a fixed step size.

[General]

[FMU]
FMU_path = 00-Simulink_Linux.fmu
stop_time =  1
dt = 0.5

[Reinforcement Learning]
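
The file uses INI syntax, so the three settings can also be inspected with Python's standard `configparser`. The sketch below is not how StableRLS reads the file (that is the job of `stablerls.configreader`); it only parses the example config from a string and derives the number of fixed steps per episode, assuming an episode runs from time 0 to `stop_time` in increments of `dt`.

```python
import configparser

# Same content as the example config above, inlined so the snippet is self-contained.
CONFIG_TEXT = """
[General]

[FMU]
FMU_path = 00-Simulink_Linux.fmu
stop_time = 1
dt = 0.5

[Reinforcement Learning]
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

fmu = parser["FMU"]
fmu_path = fmu["FMU_path"]
stop_time = fmu.getfloat("stop_time")
dt = fmu.getfloat("dt")

# Assumption: the episode covers [0, stop_time] in steps of dt.
steps_per_episode = round(stop_time / dt)

print(fmu_path, stop_time, dt, steps_per_episode)
```

With stop_time = 1 and dt = 0.5 this gives two steps, which matches the two step results printed later in this tutorial.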

[1]:

# this contains the environment class
import stablerls.gymFMU as gymFMU
# this will read our config file
import stablerls.configreader as cfg_reader

import numpy as np
import logging
# INFO-level logging is normally not recommended, but it is used here for demonstration
logging.basicConfig(level=logging.INFO)

For simplicity, the compiled FMU models for Linux and Windows are already included. However, if you own MATLAB you can compile the *.slx models yourself. If you want to compile the model, you can keep the default FMU_path in the config file. Otherwise, please change it to 00-Simulink_Windows.fmu or 00-Simulink_Linux.fmu, depending on your operating system.
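
If you prefer not to look up the file name by hand, the choice between the two precompiled FMUs can be derived from the operating system. This is a minimal sketch using only the standard library; the helper `pick_precompiled_fmu` is a hypothetical name and not part of StableRLS, and the resulting file name still has to be entered as FMU_path in the config file.

```python
import platform

# File names of the precompiled FMUs mentioned above.
PRECOMPILED_FMUS = {
    "Linux": "00-Simulink_Linux.fmu",
    "Windows": "00-Simulink_Windows.fmu",
}


def pick_precompiled_fmu(system_name=None):
    """Return the precompiled FMU file name for the given OS name.

    Hypothetical helper for illustration; defaults to the current OS.
    """
    if system_name is None:
        system_name = platform.system()
    try:
        return PRECOMPILED_FMUS[system_name]
    except KeyError:
        # Only Linux and Windows builds are mentioned in this tutorial.
        raise RuntimeError(
            f"No precompiled FMU for {system_name!r}; compile the model instead."
        ) from None


print(pick_precompiled_fmu("Linux"))
```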

[2]:

# First of all we have to read the config file
config = cfg_reader.configreader('00-config.cfg')

# Optionally compile the Simulink model.
# This requires MATLAB and the MATLAB Engine for Python!
if True:
    import stablerls.createFMU as createFMU
    createFMU.createFMU(config,'SimulinkExample00.slx')

Warning: The data dictionary 'BusSystem.sldd' was not found.
Warning: The data dictionary 'BusSystem.sldd' was not found.
Setting System Target to FMU Co-Simulation for model 'SimulinkExample00'.
Setting Hardware Implementation > Device Type to 'MATLAB Host' for model 'SimulinkExample00'.
### 'GenerateComments' is disabled for Co-Simulation Standalone FMU Export.

Build Summary

Top model targets built:

Model              Action                        Rebuild Reason
===================================================================================================
SimulinkExample00  Code generated and compiled.  Code generation information file does not exist.

1 of 1 models built (0 models already up to date)
Build duration: 0h 0m 6.3136s
### Model was successfully exported to co-simulation standalone FMU: '/home/cao2851/git/stablerls/examples/SimulinkExample00.fmu'.

The FMU is available now and the default options of the StableRLS gymnasium environment are sufficient to run the first simulation.

[3]:

# create instance of the model
env = gymFMU.StableRLS(config)

# default reset call before the simulation starts
obs = env.reset()

# the same action is applied in every step
action = np.array([1, 2, 3, 4])

terminated = False
truncated = False
while not (terminated or truncated):
    observation, reward, terminated, truncated, info = env.step(action)
    print(f'Action: {action}\nObservation: {observation}\nReward: {reward}\n')

env.close()

INFO:stablerls.fmutools:Using: 00-Simulink_Linux.fmu
INFO:stablerls.fmutools:Unzipped in /tmp/tmphq_9jw1t
INFO:stablerls.fmutools:Found inputs - access them by the corresponding number:
INFO:stablerls.fmutools: 0: Control.Sum2
INFO:stablerls.fmutools: 1: Control.Sum1
INFO:stablerls.fmutools: 2: Control.Terminated.Signal1
INFO:stablerls.fmutools: 3: Control.Terminated.Signal2
INFO:stablerls.fmutools:Found outputs - access them by the corresponding number:
INFO:stablerls.fmutools: 0: Measurement.SumOut
INFO:stablerls.gymFMU:Simulation done

Action: [1 2 3 4]
Observation: [3.]
Reward: 1

INFO:stablerls.fmutools:Close fmu and try deleting unzipdir

Action: [1 2 3 4]
Observation: [3.]
Reward: 1
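
The loop above follows the Gymnasium convention in which `step` returns a five-element tuple and the episode ends once `terminated` or `truncated` becomes true. To try this control flow without an FMU, the sketch below uses a hypothetical stand-in class, `CountingEnv`, which is not part of StableRLS. As assumptions chosen only to mimic the numbers printed above, it returns the sum of the first two action entries as its observation, a constant reward of 1, and sets `truncated` after a fixed number of steps; how the real environment ends an episode is not shown here.

```python
class CountingEnv:
    """Hypothetical stand-in for an environment; not the StableRLS class."""

    def __init__(self, max_steps=2):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        # Gymnasium-style reset: returns (observation, info).
        self.steps = 0
        return [0.0], {}

    def step(self, action):
        self.steps += 1
        # Assumption: the observation is the sum of the first two inputs.
        observation = [float(action[0] + action[1])]
        reward = 1
        terminated = False
        # Assumption: the episode is cut off after max_steps steps.
        truncated = self.steps >= self.max_steps
        return observation, reward, terminated, truncated, {}


env = CountingEnv(max_steps=2)
observation, info = env.reset()
action = [1, 2, 3, 4]

history = []
terminated = truncated = False
while not (terminated or truncated):
    observation, reward, terminated, truncated, info = env.step(action)
    history.append((observation, reward))

print(history)
```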

The actions and outputs of each simulation/episode are stored in the environment instance as env.inputs and env.outputs. Both are reset when the reset function is called.

[4]:

env.inputs

[4]:

array([[0., 0., 0., 0.],
       [1., 2., 3., 4.],
       [1., 2., 3., 4.]])

[5]:

env.outputs

[5]:

array([[0.],
       [3.],
       [3.]])
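
Each array has one row more than the number of steps taken. A plausible reading, which is an assumption here and not confirmed by the output itself, is that row 0 holds the initial values after `reset` and row i holds the values after the i-th step, i.e. at simulation time `i * dt`. Under that assumption the rows can be paired with time stamps as sketched below; the values are copied from the output above so the snippet runs without the environment.

```python
dt = 0.5  # step size from the example config

# Values copied from the env.inputs and env.outputs output shown above.
inputs = [[0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
outputs = [[0.0], [3.0], [3.0]]

# Assumption: row i corresponds to simulation time i * dt.
times = [i * dt for i in range(len(outputs))]

for t, u, y in zip(times, inputs, outputs):
    print(f"t={t:.1f}  inputs={u}  outputs={y}")
```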