
Results and End Goals

Due to time constraints, we were unable to train our networks to convergence. However, we show that we can recover the original reward function, and we provide working demonstrations along with evidence that the network could improve with further training.

We fix the target and box as shown. We then plot the learned reward as the coordinates of the Turtlebot are changed. Note that the system has learned that it’s best to first maneuver to a position aligned with the box orientation.

We fix the target, but now plot the learned reward as the coordinates of both the Turtlebot and the box are changed. Note that the system has learned that it can maneuver directly to the target if it is already grasping the box.
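The reward plots described above come from sweeping positions over a grid while holding the remaining quantities fixed. A minimal sketch of such a sweep follows. The function names, the grid spacing, and especially toy_reward are assumptions made for illustration: toy_reward is a hypothetical stand-in for the learned reward network, not the network itself.

```python
def sweep_reward(reward_fn, box_xy, target_xy, xs, ys):
    """Evaluate reward_fn at every robot position on the grid.

    The box and target stay fixed; only the robot coordinates vary.
    Returns a list of rows, one row per y value.
    """
    return [[reward_fn((x, y), box_xy, target_xy) for x in xs] for y in ys]


def toy_reward(robot_xy, box_xy, target_xy):
    # Hypothetical placeholder for the learned reward network:
    # negative squared distance from the robot to the box.
    dx = robot_xy[0] - box_xy[0]
    dy = robot_xy[1] - box_xy[1]
    return -(dx * dx + dy * dy)


# Assumed 5x5 grid with 0.5 spacing, chosen only for the example.
xs = [i * 0.5 for i in range(5)]
ys = [i * 0.5 for i in range(5)]
grid = sweep_reward(toy_reward, box_xy=(1.0, 1.0), target_xy=(2.0, 2.0), xs=xs, ys=ys)
```

The resulting grid can be passed to any heatmap or surface plotting routine. Sweeping the box position as well would add a second pair of loops over box coordinates.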

 

A “greedy” policy which chooses the mode of the controller network as the intention, then executes the mode of the policy network for that intention.
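The greedy selection rule can be sketched as follows. This is an illustration under stated assumptions rather than our actual implementation: intention_probs stands for the controller network's output distribution over intentions, policy_modes holds the most likely action of each intention's policy network, and the two-component action (assumed here to be linear and angular velocity) is hypothetical.

```python
def argmax(values):
    """Index of the largest value in a non-empty sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def greedy_action(intention_probs, policy_modes):
    """Pick the most probable intention, then return that intention's modal action.

    intention_probs: probabilities over intentions (assumed controller output).
    policy_modes: one modal action per intention (assumed policy outputs).
    """
    best_intention = argmax(intention_probs)
    return policy_modes[best_intention]


# Hypothetical values for three intentions; actions are (linear, angular) velocity.
probs = [0.2, 0.7, 0.1]
modes = [(0.1, 0.0), (0.3, -0.2), (0.0, 0.5)]
action = greedy_action(probs, modes)
```

Because only the top intention is used, the chosen action ignores how confident the controller is in the alternatives.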

 

A “balanced” policy which takes the average of the policy network modes, weighted by the intention probabilities under the controller. The imperfections compared to the deterministic policy suggest that the controller network would improve with more training.
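The weighted average used by the balanced policy can be sketched as follows. The names and the two-component action (assumed here to be linear and angular velocity) are hypothetical, and the sketch assumes each policy network's modal action is already available as a tuple of numbers.

```python
def balanced_action(intention_probs, policy_modes):
    """Average the per-intention modal actions, weighted by intention probability.

    intention_probs: probabilities over intentions (assumed controller output).
    policy_modes: one modal action per intention (assumed policy outputs).
    Dividing by the total keeps the result a proper average even if the
    probabilities do not sum exactly to one.
    """
    total = sum(intention_probs)
    dim = len(policy_modes[0])
    return tuple(
        sum(p * mode[d] for p, mode in zip(intention_probs, policy_modes)) / total
        for d in range(dim)
    )


# Hypothetical values for two equally likely intentions.
blended = balanced_action([0.5, 0.5], [(1.0, 0.0), (0.0, 1.0)])
```

When the controller puts nearly all of its probability on one intention, the blended action approaches that intention's modal action. When the controller is uncertain, the blend mixes actions from different intentions, which is one way an undertrained controller can produce imperfect motion.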

Expert Demonstration

Learned Policy Demonstration

Future Work

As with many machine learning projects, training and tuning are art forms. Many tricks could help train the GAN and the network to convergence; they were unfortunately out of scope for this project, but we would like to try them in the future. In addition, we would like to make the box-pushing task fully closed-loop by tracking the box location via an AR tag or other means. The system is currently rather brittle because the "grasp" is triggered open-loop: we manually specify how many time steps to wait after the Turtlebot approaches the box before it should "grasp" the box. Finally, we would like to try more complex and varied tasks, and to attempt to create an entirely new supertask (for which we have no expert demonstrations) out of the component intentions of these tasks.

©2018 BY THE THREE LITTLE TURTLEBOTS. PROUDLY CREATED WITH WIX.COM
