Question
I have created a custom environment, as per the OpenAI Gym framework, containing step, reset, action, and reward functions. I aim to run OpenAI baselines on this custom environment. But prior to this, the environment has to be registered on OpenAI Gym. I would like to know how a custom environment can be registered with OpenAI Gym. Also, should I be modifying the OpenAI baselines code to incorporate this?
Answer
You do not need to modify the baselines repo.
Here is a minimal example. Say you have myenv.py, with all the needed functions (step, reset, ...). The name of the environment class is MyEnv, and you want to add it to the classic_control folder. You have to:
- Place the myenv.py file in gym/gym/envs/classic_control
- Add to __init__.py (located in the same folder):
  from gym.envs.classic_control.myenv import MyEnv
- Register the environment in gym/gym/envs/__init__.py by adding:
  gym.envs.register(
      id='MyEnv-v0',
      entry_point='gym.envs.classic_control:MyEnv',
      max_episode_steps=1000,
  )
At registration, you can also add reward_threshold and kwargs (if your class takes some arguments).
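For example, a registration using both options might look like the following sketch; the reward_threshold value and the kwargs entries are purely illustrative and assume MyEnv.__init__ accepts a size argument:

gym.envs.register(
    id='MyEnv-v0',
    entry_point='gym.envs.classic_control:MyEnv',
    max_episode_steps=1000,
    reward_threshold=-100.0,   # illustrative threshold; pick one that makes sense for your task
    kwargs={'size': 2},        # only if MyEnv.__init__ takes these arguments
)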
You can also directly register the environment in the script you will run (TRPO, PPO, or whatever) instead of doing it in gym/gym/envs/__init__.py.
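As a minimal sketch of that approach, you can register inside your own run script and sanity-check the environment with random actions before handing MyEnv-v0 to a baselines algorithm (the baselines call itself is omitted here, since its exact API depends on the baselines version you use):

import gym

gym.envs.register(
    id='MyEnv-v0',
    entry_point='gym.envs.classic_control:MyEnv',
    max_episode_steps=1000,
)

env = gym.make('MyEnv-v0')
obs = env.reset()
for _ in range(100):
    # Random actions, just to verify that step/reset behave as expected.
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()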
EDIT
This is a minimal example to create the LQR environment.
Save the code below in lqr_env.py and place it in the classic_control folder of gym.
import gym
from gym import spaces
from gym.utils import seeding
import numpy as np

class LqrEnv(gym.Env):

    def __init__(self, size, init_state, state_bound):
        self.init_state = init_state
        self.size = size
        self.action_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
        self.observation_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
        self._seed()

    def _seed(self, seed=None):
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def _step(self, u):
        # Quadratic cost on state and action; the reward is the negative cost.
        costs = np.sum(u**2) + np.sum(self.state**2)
        self.state = np.clip(self.state + u, self.observation_space.low, self.observation_space.high)
        return self._get_obs(), -costs, False, {}

    def _reset(self):
        # Start from a state drawn uniformly in [-init_state, init_state].
        high = self.init_state * np.ones((self.size,))
        self.state = self.np_random.uniform(low=-high, high=high)
        self.last_u = None
        return self._get_obs()

    def _get_obs(self):
        return self.state
Add from gym.envs.classic_control.lqr_env import LqrEnv to __init__.py (also in classic_control).
In your script, when you create the environment, do
gym.envs.register(
    id='Lqr-v0',
    entry_point='gym.envs.classic_control:LqrEnv',
    max_episode_steps=150,
    kwargs={'size': 1, 'init_state': 10., 'state_bound': np.inf},
)
env = gym.make('Lqr-v0')
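From there the environment behaves like any built-in one; a purely illustrative interaction loop (numpy is already imported as np for the register call above) could be:

obs = env.reset()
for _ in range(5):
    # Zero action: the reward is just the negative quadratic cost of the current state.
    obs, reward, done, info = env.step(np.zeros(1))
    print(obs, reward)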