Project - Reinforcement Learning - How to make a computer learn to play the CartPole game


Let's make the agent play the game!

  • This section helps you get familiar with the basic workings of the environment.

  • In the previous section, we reset the game and got back obs, an observation with 4 values:

    • First value -0.01258566 is the position of the cart.

    • Second value -0.00156614 is the velocity of the cart.

    • Third value 0.04207708 is the angle of the pole.

    • Fourth value -0.00180545 is the angular velocity of the pole.

  • The angle (the third value above) is 0.04207708, which is positive (0.04207708 > 0), so the pole is slanting towards the right.

  • The agent therefore needs to move the cart towards the right to bring it back under the pole, so the action for the next step should be 1.

  • We are going to code the same below.
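
The rule described above can be sketched as a small function. This is only an illustration: the name choose_action is hypothetical, and mapping an angle of exactly 0 to action 0 is an assumption, since the text only covers the positive and negative cases.

```python
# Observation returned by the reset in the previous section
obs = [-0.01258566, -0.00156614, 0.04207708, -0.00180545]


def choose_action(obs):
    """Return 1 if the pole leans right (angle > 0), otherwise 0."""
    angle = obs[2]  # third value: angle of the pole
    # Assumption: an angle of exactly 0 falls through to action 0.
    return 1 if angle > 0 else 0


action = choose_action(obs)
print(action)
```

For the observation above the angle is positive, so the function picks action 1, matching the reasoning in the bullets.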


Write the following code snippets in a single code cell, so that you can re-run the cell repeatedly and play with it.

  • If angle > 0, the action should be 1; if angle < 0, the action should be 0. Initially, the angle of the pole is positive (slanting towards the right), so set action to 1.

  • Perform a step and get the return values:

    obs, reward, done, info = env.step(action)
  • Print obs to see the new state.

  • Now look at the angle (the third value of obs) and set action to 0 or 1 accordingly: 1 if the angle is positive, 0 if it is negative. Run the cell again and watch how the game evolves.

  • We can also print reward, done, and info in a separate code cell to see what they look like.
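
Instead of re-running the cell by hand, the steps above can be wrapped in a loop. The following is a minimal sketch, not the course's reference answer: the function name play_steps and the max_steps parameter are hypothetical, and it assumes env.step returns the four values shown above (obs, reward, done, info).

```python
def play_steps(env, obs, max_steps=10):
    """Apply the angle rule for up to max_steps steps, printing each result."""
    history = []
    for _ in range(max_steps):
        # Angle rule from the text: lean right -> action 1, otherwise 0
        action = 1 if obs[2] > 0 else 0
        # Assumes the 4-value return used in this section
        obs, reward, done, info = env.step(action)
        print(obs, reward, done, info)
        history.append((action, reward, done))
        if done:
            # The episode has ended; stop stepping
            break
    return history
```

With the env and obs from the previous section, calling play_steps(env, obs) prints one line per step and stops early once done becomes True. Newer Gym and Gymnasium releases return five values from step (obs, reward, terminated, truncated, info), so the unpacking line would need adjusting there.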
