Hello World: Your First Agent

With the minerl package installed on your system, you can now make your first agent in Minecraft!

To get started, let’s first import the necessary packages:

import gym
import minerl

Creating an environment

Now we can choose any one of the many environments included in the minerl package. To learn more about the environments, check out the environment documentation.

For this tutorial we’ll choose the MineRLBasaltFindCave-v0 environment. In this task, the agent is placed in a new world and its (subjective) goal is to find a cave and end the episode.

To create the environment, simply invoke gym.make:

env = gym.make('MineRLBasaltFindCave-v0')


Currently, minerl supports environment rendering only in headed environments (servers with monitors attached).

To run minerl environments on a machine without a display, use a software renderer such as xvfb:

xvfb-run python3 <your_script.py>
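
A script can also check for a display before it creates the environment, so that it stops with a clear message instead of failing partway through launch. The sketch below assumes a Linux X11 setup, where the display is advertised through the DISPLAY environment variable (xvfb-run sets it for the command it wraps). The has_display helper is hypothetical and is not part of the minerl package.

```python
import os


def has_display(environ=None):
    """Return True if an X11 display appears to be available.

    Assumption: Linux with X11, where DISPLAY names the display to use.
    This helper is illustrative only and is not part of minerl.
    """
    if environ is None:
        environ = os.environ
    # An unset or empty DISPLAY means no X server has been advertised.
    return bool(environ.get("DISPLAY"))
```

Calling has_display() at the top of a script and exiting with a hint to use xvfb-run when it returns False is one way to use it.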


If you’re worried and want to make sure something is happening behind the scenes, install a logger before you create the environment.

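import logging
# Configure the root logger so that debug messages are printed
logging.basicConfig(level=logging.DEBUG)

env = gym.make('MineRLBasaltFindCave-v0')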

Taking actions

As a warm-up, let’s create a random agent. 🧠

Now we can reset the environment to its initial state and get our first observation from the agent.

# Note that this command will launch the MineRL environment, which takes time.
# Be patient!
obs = env.reset()

The obs variable is a dictionary of the observations returned by the environment. In the case of the MineRLBasaltFindCave-v0 environment, only one observation is returned: pov, an RGB image of the agent’s first-person perspective.

{
    'pov': array([[[ 63,  63,  68],
        [ 63,  63,  68],
        [ 63,  63,  68],
        ...,
        [ 92,  92, 100],
        [ 92,  92, 100],
        [ 92,  92, 100]],

        ...,

        [[ 95, 118, 176],
        [ 95, 119, 177],
        [ 96, 119, 178],
        ...,
        [ 93, 116, 172],
        [ 93, 115, 171],
        [ 92, 115, 170]]], dtype=uint8)
}
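
Printing the full array is rarely useful, so it can help to look at the shape and dtype of each entry instead. The following is a minimal sketch; describe_observation is a hypothetical helper, not part of minerl, and it only assumes that the values in obs expose NumPy-style shape and dtype attributes.

```python
def describe_observation(obs):
    """Map each observation key to a (shape, dtype) pair.

    Works for any dict whose values expose NumPy-style `shape` and
    `dtype` attributes; values without them are reported as None.
    """
    summary = {}
    for key, value in obs.items():
        shape = getattr(value, "shape", None)
        dtype = getattr(value, "dtype", None)
        summary[key] = (shape, str(dtype) if dtype is not None else None)
    return summary
```

After obs = env.reset(), calling print(describe_observation(obs)) shows one entry per observation key.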

Now let’s take actions in the environment until time runs out or the agent dies. To do this, we will use the standard OpenAI Gym env.step method.

done = False

while not done:
    # Take a random action
    action = env.action_space.sample()
    # In BASALT environments, sending the ESC action will end the episode
    # Let's not do that
    action["ESC"] = 0
    obs, reward, done, _ = env.step(action)
    # Draw the current frame so we can watch the agent
    env.render()

With the env.render call, you should see the agent move sporadically until the done flag is set to true, which will happen when the agent runs out of time (3 minutes in the FindCave task).
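
The loop above can be wrapped in a small function that also counts steps and closes the environment when the episode ends. This is a sketch under stated assumptions, not part of minerl: run_random_episode is a hypothetical name, and env is assumed to follow the classic Gym API used in this tutorial (step returns a 4-tuple, and close shuts the environment down).

```python
def run_random_episode(env, render=True):
    """Run one episode with random actions; return (steps, total_reward).

    Assumes the classic Gym API used in this tutorial: reset() returns
    an observation and step() returns (obs, reward, done, info).
    """
    steps = 0
    total_reward = 0.0
    try:
        env.reset()
        done = False
        while not done:
            action = env.action_space.sample()
            # Keep the random agent from ending the episode early.
            action["ESC"] = 0
            _, reward, done, _ = env.step(action)
            total_reward += reward
            steps += 1
            if render:
                env.render()
    finally:
        # Shut the environment down even if the loop raises.
        env.close()
    return steps, total_reward
```

With the environment created earlier, steps, total_reward = run_random_episode(env) runs a single episode. The try/finally block is a design choice so that the Minecraft process is not left running if an exception interrupts the loop.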