
You searched for +publisher:"Delft University of Technology" +contributor:("Moerland, Thomas"). Showing records 1 – 3 of 3 total matches.

No search limiters apply to these results.


Delft University of Technology

1. Deichler, Anna (author). Generalization and locality in the AlphaZero algorithm: A study in single agent, fully observable, deterministic environments.

Degree: 2019, Delft University of Technology

Recently, the AlphaGo algorithm defeated top-level human players in the game of Go, an achievement long considered an AI milestone. The challenging properties of Go (high state-space complexity, a long reward horizon, and a large action branching factor) are shared by many other complex planning problems, such as robotics applications, which makes the algorithmic solutions behind AlphaGo particularly interesting for further research. One of the key innovations in the algorithm is the combination of Monte Carlo tree search (MCTS) with deep learning. The main hypothesis of the thesis is that the success of the algorithm can be attributed to the combination of the generalization capacity of deep neural networks and the local information provided by tree search. This hypothesis is evaluated by applying the AlphaZero algorithm (an extension of AlphaGo) to single-player, deterministic, fully observable reinforcement learning environments. The thesis addresses two research questions. First, what changes are needed to transfer the AlphaZero algorithm to these environments? The change in reward support in the new environments can cause learning to fail, since assumptions of the MCTS algorithm are violated; the thesis offers solutions to this problem, including adaptive return normalization. The second research question examines the relative importance of locality and generalization in the performance of the AlphaZero algorithm. It is answered by comparing search trees of varying sizes in several RL environments under fixed time budgets. Building small trees supports generalization by allowing more frequent training of the neural network, while building larger trees provides more accurate estimates. This creates a trade-off between improved generalization capacity and more accurate local information under a fixed time budget. The experimental results show that mid-sized trees achieve the best performance, which suggests that balancing local information and generalization is key to the success of the algorithm. Based on these results, possible extensions to the algorithm are proposed. Finally, the thesis highlights the relevance of the two-component system from a broader perspective and discusses the possible future impact of the algorithm.
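The abstract's two central ideas, a tree search guided by a neural network's value and policy estimates, and adaptive normalization of returns so that value estimates stay bounded when the reward scale of a single-agent environment is unknown, can be illustrated with a short sketch. The names below (AdaptiveReturnNormalizer, puct_score) are assumptions made for this example, and the simple running min-max scheme may differ from the thesis's exact adaptive return normalization.

```python
import math


class AdaptiveReturnNormalizer:
    """Running min-max normalizer for observed returns, so that value
    estimates fed to the selection rule stay in [0, 1] even when the
    environment's reward scale is not known in advance (illustrative
    sketch only, not the thesis's implementation)."""

    def __init__(self):
        self.min_return = float("inf")
        self.max_return = float("-inf")

    def update(self, ret):
        # Track the smallest and largest returns seen so far.
        self.min_return = min(self.min_return, ret)
        self.max_return = max(self.max_return, ret)

    def normalize(self, ret):
        if self.max_return <= self.min_return:
            return 0.0  # no spread observed yet
        return (ret - self.min_return) / (self.max_return - self.min_return)


def puct_score(child_value, child_visits, parent_visits, prior, normalizer, c_puct=1.5):
    """AlphaZero-style selection score: a normalized value estimate plus an
    exploration bonus weighted by the network's prior probability."""
    q = normalizer.normalize(child_value)
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u
```

During tree construction, every backed-up return would first pass through update(), so that later selections compare child values on a common scale regardless of the environment's reward range.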

Mechanical Engineering | Systems and Control

Advisors/Committee Members: Moerland, Thomas (mentor), Baldi, Simone (graduation committee), Delft University of Technology (degree granting institution).

Subjects/Keywords: Reinforcement learning; Monte Carlo Tree Search; Deep Learning



APA (6th Edition):

Deichler, A. (2019). Generalization and locality in the AlphaZero algorithm: A study in single agent, fully observable, deterministic environments. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:b922fec8-8085-4757-a91c-c68d840c03bd

Chicago Manual of Style (16th Edition):

Deichler, Anna (author). “Generalization and locality in the AlphaZero algorithm: A study in single agent, fully observable, deterministic environments.” 2019. Masters Thesis, Delft University of Technology. Accessed January 20, 2021. http://resolver.tudelft.nl/uuid:b922fec8-8085-4757-a91c-c68d840c03bd.

MLA Handbook (7th Edition):

Deichler, Anna (author). “Generalization and locality in the AlphaZero algorithm: A study in single agent, fully observable, deterministic environments.” 2019. Web. 20 Jan 2021.

Vancouver:

Deichler A. Generalization and locality in the AlphaZero algorithm: A study in single agent, fully observable, deterministic environments. [Internet] [Masters thesis]. Delft University of Technology; 2019. [cited 2021 Jan 20]. Available from: http://resolver.tudelft.nl/uuid:b922fec8-8085-4757-a91c-c68d840c03bd.

Council of Science Editors:

Deichler A. Generalization and locality in the AlphaZero algorithm: A study in single agent, fully observable, deterministic environments. [Masters Thesis]. Delft University of Technology; 2019. Available from: http://resolver.tudelft.nl/uuid:b922fec8-8085-4757-a91c-c68d840c03bd


Delft University of Technology

2. Cherici, Teo (author). Robotic Auxiliary Losses for continuous reinforcement learning.

Degree: 2018, Delft University of Technology

Recent advances in computational power and artificial intelligence have enabled advanced reinforcement learning models that could revolutionize, among other fields, robotics. As model and environment complexity increase, however, training solely through the feedback of the environment reward becomes more difficult. Building on the work on robotic priors by R. Jonschkowski et al., we present robotic auxiliary losses for continuous reinforcement learning models. These act as additional feedback based on physics principles, such as Newton's laws of motion, to be used by the reinforcement learning model during training in robotic environments. We furthermore explore the issues of concurrently optimizing several losses and present a continuous loss normalization method for balancing training effort between the main and auxiliary losses. In all continuous robotic environments tested, individual robotic auxiliary losses show consistent improvement over the base reinforcement learning model. The joint application of all losses during training, however, did not always guarantee performance improvements, as the concurrent optimization of several losses of different natures proved difficult.
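As a rough illustration of the loss-balancing idea described in the abstract, the sketch below keeps a running estimate of each loss's magnitude and rescales the terms before summing them, so that the main and auxiliary losses contribute at comparable scales. The class name ContinuousLossNormalizer, the loss names, and the exponential-moving-average scheme are assumptions made for this example; the thesis's continuous loss normalization method may be defined differently.

```python
class ContinuousLossNormalizer:
    """Balances a main RL loss against auxiliary (physics-prior) losses by
    dividing each loss by a running estimate of its magnitude. Illustrative
    sketch only; works with plain floats or autograd scalars that support
    float() and arithmetic."""

    def __init__(self, loss_names, momentum=0.99, eps=1e-8):
        self.momentum = momentum
        self.eps = eps
        self.running = {name: None for name in loss_names}

    def combine(self, losses, weights=None):
        weights = weights or {name: 1.0 for name in losses}
        total = 0.0
        for name, value in losses.items():
            magnitude = abs(float(value))
            prev = self.running[name]
            self.running[name] = magnitude if prev is None else (
                self.momentum * prev + (1.0 - self.momentum) * magnitude)
            # Rescale so each term contributes at roughly unit magnitude.
            total = total + weights[name] * value / (self.running[name] + self.eps)
        return total


# Hypothetical usage with a main policy loss and two auxiliary losses:
normalizer = ContinuousLossNormalizer(["policy", "temporal_coherence", "newton"])
combined = normalizer.combine(
    {"policy": 2.3, "temporal_coherence": 0.04, "newton": 15.0})
```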

Biomechanical Design | BioRobotics

Advisors/Committee Members: Moerland, Thomas (mentor), Jonker, Pieter (mentor), Delft University of Technology (degree granting institution).

Subjects/Keywords: Reinforcement Learning; Loss Functions; Robotics



APA (6th Edition):

Cherici, T. (2018). Robotic Auxiliary Losses for continuous reinforcement learning. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:82fb85b9-9b68-4f05-a89b-53e82c593194

Chicago Manual of Style (16th Edition):

Cherici, Teo (author). “Robotic Auxiliary Losses for continuous reinforcement learning.” 2018. Masters Thesis, Delft University of Technology. Accessed January 20, 2021. http://resolver.tudelft.nl/uuid:82fb85b9-9b68-4f05-a89b-53e82c593194.

MLA Handbook (7th Edition):

Cherici, Teo (author). “Robotic Auxiliary Losses for continuous reinforcement learning.” 2018. Web. 20 Jan 2021.

Vancouver:

Cherici T. Robotic Auxiliary Losses for continuous reinforcement learning. [Internet] [Masters thesis]. Delft University of Technology; 2018. [cited 2021 Jan 20]. Available from: http://resolver.tudelft.nl/uuid:82fb85b9-9b68-4f05-a89b-53e82c593194.

Council of Science Editors:

Cherici T. Robotic Auxiliary Losses for continuous reinforcement learning. [Masters Thesis]. Delft University of Technology; 2018. Available from: http://resolver.tudelft.nl/uuid:82fb85b9-9b68-4f05-a89b-53e82c593194


Delft University of Technology

3. Moring, Stefan (author). Kinodynamic Steering using Supervised Learning in RRT.

Degree: 2018, Delft University of Technology

As the need for robots to operate autonomously grows, the research field of motion planning is becoming increasingly active. Planning is usually done in configuration space, which often leads to infeasible solutions for highly dynamical or underactuated systems. Kinodynamic planning makes it possible to plan motion for this difficult class of systems as well. However, due to the difficulty of the problem, computation time is an issue. RRT CoLearn is a novel variant of the original RRT algorithm that tries to decrease computation time by replacing computationally heavy steps of the algorithm with supervised learning. In this thesis the performance of RRT CoLearn is investigated, and it is found not to work on multi-DOF systems. Furthermore, a novel steering function called Inverse Dynamics Learning is presented, which is shown to converge over five times faster than RRT CoLearn and to also converge on a highly nonlinear 2-DOF system.
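To make the role of a learned steering function concrete, the following sketch shows a kinodynamic RRT loop in which the usual steering step (solving a two-point boundary-value problem under the system dynamics) is replaced by a learned predictor, for example an inverse-dynamics regressor. The callables sample_state, learned_steer, propagate, and goal_reached are assumed interfaces invented for this example and do not come from the thesis.

```python
import math


def distance(a, b):
    """Euclidean distance in state space; kinodynamic planners typically need
    a dynamics-aware metric, so treat this as a placeholder."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def backtrack(tree, node):
    """Follow parent pointers back to the root to recover the planned path."""
    path = [node]
    while tree[path[-1]] is not None:
        path.append(tree[path[-1]])
    return list(reversed(path))


def kinodynamic_rrt(x_start, sample_state, learned_steer, propagate, goal_reached,
                    max_iters=5000):
    """Skeleton of a kinodynamic RRT where steering is done by a learned
    model instead of solving a boundary-value problem at every extension."""
    tree = {tuple(x_start): None}  # node -> parent
    for _ in range(max_iters):
        x_rand = sample_state()                                # sample a target state
        x_near = min(tree, key=lambda x: distance(x, x_rand))  # nearest existing node
        u = learned_steer(x_near, x_rand)                      # predicted control toward x_rand
        x_new = tuple(propagate(x_near, u))                    # forward-simulate the dynamics
        tree[x_new] = x_near
        if goal_reached(x_new):
            return backtrack(tree, x_new)
    return None  # no plan found within the iteration budget
```

Because the new state is always obtained by forward simulation of the predicted control, every edge in the tree respects the dynamics, which is the property that distinguishes kinodynamic planning from planning purely in configuration space.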

Science in Signals & Systems

Advisors/Committee Members: Wisse, Martijn (mentor), Bharatheesha, Mukunda (mentor), Spaan, Matthijs (graduation committee), Alonso Mora, Javier (graduation committee), Moerland, Thomas (graduation committee), Delft University of Technology (degree granting institution).

Subjects/Keywords: Motion Planning; RRT; Kinodynamic; Supervised Learning



APA (6th Edition):

Moring, S. (2018). Kinodynamic Steering using Supervised Learning in RRT. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:fdf13674-f2b5-4b4a-b03d-21299114642a

Chicago Manual of Style (16th Edition):

Moring, Stefan (author). “Kinodynamic Steering using Supervised Learning in RRT.” 2018. Masters Thesis, Delft University of Technology. Accessed January 20, 2021. http://resolver.tudelft.nl/uuid:fdf13674-f2b5-4b4a-b03d-21299114642a.

MLA Handbook (7th Edition):

Moring, Stefan (author). “Kinodynamic Steering using Supervised Learning in RRT.” 2018. Web. 20 Jan 2021.

Vancouver:

Moring S. Kinodynamic Steering using Supervised Learning in RRT. [Internet] [Masters thesis]. Delft University of Technology; 2018. [cited 2021 Jan 20]. Available from: http://resolver.tudelft.nl/uuid:fdf13674-f2b5-4b4a-b03d-21299114642a.

Council of Science Editors:

Moring S. Kinodynamic Steering using Supervised Learning in RRT. [Masters Thesis]. Delft University of Technology; 2018. Available from: http://resolver.tudelft.nl/uuid:fdf13674-f2b5-4b4a-b03d-21299114642a
