Cliff Walking with Q-Learning NetLogo Extension (version 1.0.0)

This model implements a classic reinforcement learning scenario, the “Cliff Walking” problem. Consider the gridworld shown below. This is a standard undiscounted, episodic task, with start and goal states, and the usual actions causing movement up, down, right, and left. Reward is -1 on all transitions except those into the region marked “The Cliff.” Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start (Sutton & Barto, 2018).

![](upload://k45aunzIIdmxnRIS0GzkM1Zg5V3.jpeg)

The problem is solved in this model using the Q-learning algorithm, implemented with the support of the NetLogo Q-Learning Extension.
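
For reference, the one-step temporal-difference update that Q-learning applies after each transition (state $s$, action $a$, reward $r$, next state $s'$) is:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

where $\alpha$ is the learning rate and $\gamma$ the discount factor ($\gamma = 1$ here, since the task is undiscounted).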

Release Notes

This model implements Q-learning (Watkins, 1989), a one-step temporal-difference algorithm from reinforcement learning, a branch of artificial intelligence and machine learning.

Rather than hand-coding the algorithm, the implementation delegates the Q-learning machinery to the Q-Learning NetLogo Extension.
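
For readers who want to see the moving parts, below is a minimal, self-contained sketch of tabular Q-learning on a cliff-walking grid written in plain NetLogo (using the bundled table extension). It is illustrative only: the model itself delegates this bookkeeping to the extension, and every name and parameter value in the sketch (alpha, epsilon, the 12 × 4 grid, and so on) is an assumption, not the model's actual code.

```
extensions [table]

globals [q-table alpha gamma epsilon actions]

to setup
  clear-all
  resize-world 0 11 0 3               ;; 12 x 4 grid, as in Sutton & Barto's figure
  set q-table table:make              ;; maps "x,y,action" keys to Q-values
  set alpha 0.1                       ;; learning rate (illustrative value)
  set gamma 1                         ;; undiscounted task, per the description
  set epsilon 0.1                     ;; exploration probability for e-greedy
  set actions ["up" "down" "left" "right"]
  create-turtles 1 [ setxy 0 0 ]      ;; start state: bottom-left corner
  reset-ticks
end

to go
  ask turtles [
    let s state-key
    ;; e-greedy action selection
    let a ifelse-value (random-float 1 < epsilon) [ one-of actions ] [ best-action s ]
    take-action a
    let r reward-here                                ;; -1 per step, -100 on the cliff
    let done? (pxcor = max-pxcor and pycor = 0)      ;; goal: bottom-right corner
    if r = -100 [ setxy 0 0 ]                        ;; the cliff sends the agent back to start
    ;; one-step Q-learning update; terminal transitions use the bare reward
    let target ifelse-value done? [ r ] [ r + gamma * max-q state-key ]
    let old q-of s a
    table:put q-table (word s "," a) (old + alpha * (target - old))
    if done? [ setxy 0 0 ]                           ;; begin a new episode
  ]
  tick
end

to take-action [a]  ;; turtle procedure: move one cell, staying inside the grid
  if a = "up"    and pycor < max-pycor [ set ycor ycor + 1 ]
  if a = "down"  and pycor > 0         [ set ycor ycor - 1 ]
  if a = "left"  and pxcor > 0         [ set xcor xcor - 1 ]
  if a = "right" and pxcor < max-pxcor [ set xcor xcor + 1 ]
end

to-report reward-here  ;; turtle procedure: the cliff is the bottom row between start and goal
  report ifelse-value (pycor = 0 and pxcor > 0 and pxcor < max-pxcor) [-100] [-1]
end

to-report state-key  ;; turtle procedure
  report (word pxcor "," pycor)
end

to-report best-action [s]  ;; greedy action with respect to the current Q-values
  report first sort-by [[a1 a2] -> q-of s a1 > q-of s a2] actions
end

to-report max-q [s]
  report max map [a -> q-of s a] actions
end

to-report q-of [s a]
  report table:get-or-default q-table (word s "," a) 0
end
```

In the actual model, the extension takes over the role of the Q-table and update bookkeeping shown above.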


This is a companion discussion topic for the original entry at https://www.comses.net/codebases/b938a820-f209-4648-afc6-0946657c3484/releases/1.0.0/

The downloaded archive contains only one file: codemeta.json. Where can I find the NetLogo source code? Thanks!

Thanks for pointing this out! We’ve rebuilt the archival package, and you should now find the code and documentation in the download. There appears to have been a bug in our system that kept the package from being built correctly the first time around from @kevinkons’ original submission.