This section will help you get familiar with the basic workings of the environment.
In the previous section, we reset the game and were returned obs with 4 values:
The first value, -0.01258566, is the position of the cart.
The second value, -0.00156614, is the velocity of the cart.
The third value, 0.04207708, is the angle of the pole.
The fourth value, -0.00180545, is the angular velocity of the pole.
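As a small illustration of the layout described above, the four values can be unpacked into named variables. This is only a sketch: the numbers are copied from the obs shown above, and the variable names are our own choice, not part of the environment.

```python
# Values copied from the obs printed in the previous section
obs = [-0.01258566, -0.00156614, 0.04207708, -0.00180545]

# Unpack the four values into descriptive names (names chosen for illustration)
cart_position, cart_velocity, pole_angle, pole_angular_velocity = obs

# A positive angle means the pole is slanting towards the right
print(pole_angle > 0)
```

Giving each value a name makes it easier to see which one drives the decision about the next action.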
We observe that the angle (the third value above) is 0.04207708, which is positive (0.04207708 > 0). This means the pole is slanting towards the right, so the agent needs to move the cart toward the right. Thus the action for the next step should be 1.
We are going to code the same below.
It is best to write the following code snippets in a single code cell, so that you can run them repeatedly and play with the game.
If angle > 0, the action should be 1; if angle < 0, the action should be 0. Initially, since the angle of the pole is positive (slanting towards the right), the action should be 1:
action=1
Perform a step and get the return values:
obs, reward, done, info = env.step(action)
Print obs:
print(obs)
Now, observe the angle value (the 3rd value of obs) and change the value of action to 0 or 1 accordingly. If angle > 0, the action should be 1; if angle < 0, the action should be 0. Play the game and observe how the cart and pole behave.
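The rule above can also be written as a small helper, so the action does not have to be edited by hand after every step. This is a minimal sketch: the function name choose_action is hypothetical, and returning 0 when the angle is exactly zero is our own assumption, since the rule above does not cover that case.

```python
def choose_action(obs):
    """Pick the next action from the pole angle (the 3rd value of obs)."""
    angle = obs[2]
    # angle > 0: pole slants right, so push the cart right (action 1).
    # Otherwise push left (action 0); treating angle == 0 as 0 is an assumption.
    return 1 if angle > 0 else 0


# Example with the obs values shown earlier in this section
print(choose_action([-0.01258566, -0.00156614, 0.04207708, -0.00180545]))
```

With this helper, the line that sets the action could read action = choose_action(obs) before each call to env.step(action).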
We can also print reward, done, and info in a separate code cell to see what they look like:
print(reward)
print(done)
print(info)
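Putting the pieces together, the manual steps above could be repeated in a loop until done becomes True. The following is only a sketch under stated assumptions: run_episode and max_steps are hypothetical names, env is the environment created in the previous section, env.reset() is assumed to return only obs, and env.step() is assumed to return four values as in the call above. Some newer Gym and Gymnasium releases return five values from step, in which case the unpacking would need to be adjusted.

```python
def run_episode(env, max_steps=200):
    """Play one episode with the angle rule and return the total reward."""
    obs = env.reset()  # assumes reset() returns only obs, as in the previous section
    total_reward = 0.0
    for _ in range(max_steps):
        # Same rule as above: push right if the pole slants right, else push left
        action = 1 if obs[2] > 0 else 0
        # Assumes the four-value return used in this section
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:  # the episode has ended, so stop stepping
            break
    return total_reward
```

Calling run_episode(env) and printing the result gives a rough idea of how long this simple rule keeps the pole up.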