This section is meant to get you familiar with the basic working of the environment.
In the previous section, we reset the game and were returned the observation obs with 4 values:
-0.01258566 is the position of the cart.
-0.00156614 is the velocity of the cart.
0.04207708 is the angle of the pole.
-0.00180545 is the angular velocity of the pole.
We observe that the angle (third value above) is 0.04207708, which is positive (0.04207708 > 0), which means the pole is slanting towards the right.
Thus the agent needs to move the cart towards the right, so the action for the next step should be 1 (push the cart to the right).
We are going to code the same below.
It is preferable to write all of the following code snippets in a single code cell, so that you can re-run the cell repeatedly and play with it.
If angle > 0, the action should be 1; if angle < 0, the action should be 0. Initially, since the angle of the pole is positive (slanting towards the right), the action should be 1:
action = 1
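The rule above can be sketched as a small helper, so that the action does not have to be typed in by hand each time. This is only an illustration: the name choose_action is hypothetical, the observation is assumed to be ordered as listed earlier (position, velocity, angle, angular velocity), and returning 0 when the angle is exactly 0 is an arbitrary choice, since the rule does not cover that case.

```python
def choose_action(obs):
    """Return 1 (push right) if the pole leans right, otherwise 0 (push left)."""
    # Assumed order: [cart position, cart velocity, pole angle, pole angular velocity]
    angle = obs[2]
    # angle == 0 is not covered by the rule in the text; this sketch returns 0 there.
    return 1 if angle > 0 else 0


# Observation returned by the reset in the previous section
obs = [-0.01258566, -0.00156614, 0.04207708, -0.00180545]
action = choose_action(obs)
print(action)  # 1, because 0.04207708 > 0
```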
Perform a step and get the return values:
obs, reward, done, info = env.step(action)
Now, observe the angle value (3rd value of obs) and change the value of action accordingly. If angle > 0, the action should be 1; if angle < 0, the action should be 0. Play the game and observe how it behaves.
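Repeating this observe-then-act cycle by hand can be automated with a short loop. The sketch below assumes env behaves as in this section (reset() returns obs, and step(action) returns the four values obs, reward, done, info); the function name play_episode and the max_steps limit are hypothetical, added only for illustration.

```python
def play_episode(env, max_steps=200):
    """Play one episode with the angle rule and return the total reward."""
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # Push right (1) if the pole leans right, otherwise push left (0)
        action = 1 if obs[2] > 0 else 0
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # The episode is over (for example, the pole has fallen)
            break
    return total_reward
```

Calling play_episode(env) in the same code cell where env was created plays one episode and returns the sum of the rewards collected along the way.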
We can also print reward, done and info in a separate code cell to see what they look like:
print(reward)
print(done)
print(info)