Question
I have created a custom environment, as per the OpenAI Gym framework, containing step, reset, action, and reward functions. I aim to run OpenAI baselines on this custom environment. But prior to this, the environment has to be registered on OpenAI Gym. I would like to know how a custom environment can be registered with OpenAI Gym. Also, should I be modifying the OpenAI baselines code to incorporate this?
Answer
You do not need to modify the baselines repo.
Here is a minimal example. Say you have myenv.py, with all the needed functions (step, reset, ...). The name of the environment class is MyEnv, and you want to add it to the classic_control folder. You have to:
- Place the myenv.py file in gym/gym/envs/classic_control
- Add to __init__.py (located in the same folder):
  from gym.envs.classic_control.myenv import MyEnv
- Register the environment in gym/gym/envs/__init__.py by adding:
  gym.envs.register(
      id='MyEnv-v0',
      entry_point='gym.envs.classic_control:MyEnv',
      max_episode_steps=1000,
  )
At registration, you can also add reward_threshold and kwargs (if your class takes some arguments).
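For example, a registration using both options might look like the following sketch; the reward_threshold value and the kwargs entries are purely illustrative and assume MyEnv.__init__ accepts a size argument:

gym.envs.register(
    id='MyEnv-v0',
    entry_point='gym.envs.classic_control:MyEnv',
    max_episode_steps=1000,
    reward_threshold=-100.0,   # illustrative threshold; pick one that makes sense for your task
    kwargs={'size': 2},        # only if MyEnv.__init__ takes these arguments
)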
You can also directly register the environment in the script you will run (TRPO, PPO, or whatever) instead of doing it in gym/gym/envs/__init__.py.
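As a minimal sketch of that approach, you can register inside your own run script and sanity-check the environment with random actions before handing MyEnv-v0 to a baselines algorithm (the baselines call itself is omitted here, since its exact API depends on the baselines version you use):

import gym

gym.envs.register(
    id='MyEnv-v0',
    entry_point='gym.envs.classic_control:MyEnv',
    max_episode_steps=1000,
)

env = gym.make('MyEnv-v0')
obs = env.reset()
for _ in range(100):
    # Random actions, just to verify that step/reset behave as expected.
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()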
EDIT
This is a minimal example to create the LQR environment.
Save the code below in lqr_env.py and place it in the classic_control folder of gym.
import gym
from gym import spaces
from gym.utils import seeding
import numpy as np

class LqrEnv(gym.Env):

    def __init__(self, size, init_state, state_bound):
        self.init_state = init_state
        self.size = size
        self.action_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
        self.observation_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
        self._seed()

    def _seed(self, seed=None):
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def _step(self, u):
        # Quadratic cost on state and action; the reward is the negative cost.
        costs = np.sum(u**2) + np.sum(self.state**2)
        self.state = np.clip(self.state + u, self.observation_space.low, self.observation_space.high)
        return self._get_obs(), -costs, False, {}

    def _reset(self):
        # Start from a state drawn uniformly in [-init_state, init_state].
        high = self.init_state * np.ones((self.size,))
        self.state = self.np_random.uniform(low=-high, high=high)
        self.last_u = None
        return self._get_obs()

    def _get_obs(self):
        return self.state
Add from gym.envs.classic_control.lqr_env import LqrEnv to __init__.py (also in classic_control).
In your script, when you create the environment, do
gym.envs.register(
    id='Lqr-v0',
    entry_point='gym.envs.classic_control:LqrEnv',
    max_episode_steps=150,
    kwargs={'size': 1, 'init_state': 10., 'state_bound': np.inf},
)
env = gym.make('Lqr-v0')
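From there the environment behaves like any built-in one; a purely illustrative interaction loop (numpy is already imported as np for the register call above) could be:

obs = env.reset()
for _ in range(5):
    # Zero action: the reward is just the negative quadratic cost of the current state.
    obs, reward, done, info = env.step(np.zeros(1))
    print(obs, reward)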