Benchmark
The performance metrics of several experiments can be compared.
The evaluation metrics, environment variables, training hyperparameters,
and evaluation parameters are logged for each experiment in the file
benchmark/benchmark_results.csv. Evaluation metrics for selected experiment IDs can be plotted
with the script scripts/plot_benchmark.py.
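As a rough illustration of how rows of the benchmark file could be selected by experiment ID, the sketch below parses a small in-memory CSV with Python's standard csv module. The column names exp_id and mean_reward and all values are hypothetical placeholders. Only n_timesteps is taken from this page, and the actual layout of benchmark/benchmark_results.csv may differ.

```python
import csv
import io

# Hypothetical excerpt of benchmark/benchmark_results.csv. The column names
# "exp_id" and "mean_reward" and every value are placeholders for illustration;
# only "n_timesteps" is mentioned in the documentation.
SAMPLE_CSV = """exp_id,n_timesteps,mean_reward
26,1000,-3.0
27,2000,-2.5
28,3000,-2.0
30,4000,-1.5
"""


def select_experiments(csv_file, exp_ids, id_col="exp_id"):
    """Return the rows (as dicts) whose experiment ID is in exp_ids."""
    wanted = set(exp_ids)
    reader = csv.DictReader(csv_file)
    return [row for row in reader if int(row[id_col]) in wanted]


# Experiment 29 is absent from the sample, so only three rows are returned.
rows = select_experiments(io.StringIO(SAMPLE_CSV), [26, 27, 28, 29])
for row in rows:
    print(row["exp_id"], row["n_timesteps"], row["mean_reward"])
```

To try this against the real file, the in-memory buffer could be replaced with open("benchmark/benchmark_results.csv", newline="") and id_col set to whatever the file actually calls its experiment ID column.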
With the local installation
| Flag | Description | Type | Example |
|---|---|---|---|
| --exp-list | List of experiments to consider for plotting | list of int | 26 27 28 29 |
| --col | Name of the hyperparameter for the X axis | str | n_timesteps |
The argument of the --col flag must match one of the column headings of the benchmark file (benchmark/benchmark_results.csv).
Example:
python scripts/plot_benchmark.py --exp-list 26 27 28 29 --col n_timesteps
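To make the flag types concrete, here is a minimal sketch of how a command line like the one above could be parsed with argparse and how the --col value could be checked against the benchmark file's column headings. It is not the code of scripts/plot_benchmark.py. The function names, the choice to make both flags required, and the header names exp_id and mean_reward are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the two documented flags."""
    parser = argparse.ArgumentParser(description="Plot benchmark results")
    parser.add_argument(
        "--exp-list",
        type=int,
        nargs="+",  # one or more integers, e.g. 26 27 28 29
        required=True,  # assumption: the real script may define a default
        help="List of experiments to consider for plotting",
    )
    parser.add_argument(
        "--col",
        type=str,
        required=True,  # assumption: the real script may define a default
        help="Name of the hyperparameter for the X axis",
    )
    return parser


def validate_col(col, header):
    """Raise ValueError if col is not one of the benchmark file's column headings."""
    if col not in header:
        raise ValueError(f"{col!r} is not a column of the benchmark file: {header}")
    return col


args = build_parser().parse_args(
    ["--exp-list", "26", "27", "28", "29", "--col", "n_timesteps"]
)
# Placeholder header: "exp_id" and "mean_reward" are hypothetical column names.
header = ["exp_id", "n_timesteps", "mean_reward"]
validate_col(args.col, header)
print(args.exp_list, args.col)  # prints: [26, 27, 28, 29] n_timesteps
```

Checking the column name before plotting gives a clear error message for a mistyped heading instead of a failure deep inside the plotting code.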
The plots are saved in the folder benchmark/plots/.
[Figure: example of an experiment benchmark plot]
With Docker
Benchmark plots can be generated using the Docker images.
# CPU
docker run -it --rm --network host --ipc=host --mount src=$(pwd),target=/root/rl_reach/,type=bind rlreach/rlreach-cpu:latest bash -c "python scripts/plot_benchmark.py --exp-list 26 27 28 29 --col n_timesteps"
# GPU
docker run -it --rm --runtime=nvidia --network host --ipc=host --mount src=$(pwd),target=/root/rl_reach/,type=bind rlreach/rlreach-gpu:latest bash -c "python scripts/plot_benchmark.py --exp-list 26 27 28 29 --col n_timesteps"
Shell scripts are provided for convenience.
# CPU
./docker/run_docker_cpu.sh python scripts/plot_benchmark.py --exp-list 26 27 28 29 --col n_timesteps
# GPU
./docker/run_docker_gpu.sh python scripts/plot_benchmark.py --exp-list 26 27 28 29 --col n_timesteps
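When several benchmark plots need to be generated from a script, the Docker command lines above can be assembled programmatically. The sketch below builds the argument list from the flags shown on this page. The helper name docker_plot_command and its parameters are hypothetical, and the example path /path/to/rl_reach is a placeholder for the repository root.

```python
import os
import shlex


def docker_plot_command(exp_ids, col, gpu=False, workdir=None):
    """Build the argv list for the documented `docker run` benchmark command."""
    workdir = workdir or os.getcwd()  # the docs mount $(pwd)
    plot_cmd = "python scripts/plot_benchmark.py --exp-list {} --col {}".format(
        " ".join(str(i) for i in exp_ids), col
    )
    argv = ["docker", "run", "-it", "--rm"]
    if gpu:
        argv.append("--runtime=nvidia")  # only used for the GPU image
    argv += [
        "--network", "host",
        "--ipc=host",
        "--mount", f"src={workdir},target=/root/rl_reach/,type=bind",
        "rlreach/rlreach-gpu:latest" if gpu else "rlreach/rlreach-cpu:latest",
        "bash", "-c", plot_cmd,
    ]
    return argv


cmd = docker_plot_command([26, 27, 28, 29], "n_timesteps", workdir="/path/to/rl_reach")
print(shlex.join(cmd))  # shell-quoted command line, for inspection
```

The resulting list could be passed to subprocess.run(cmd, check=True). Note that the -it flags expect an interactive terminal, so they may need to be dropped in non-interactive jobs.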