You searched for subject:(Upper Confidence Bounds). Showing records 1 – 3 of 3 total matches.

No search limiters apply to these results.
California State University – Northridge

1. Shahbazian, Sarmen. Monte Carlo tree search in Einstein Wurfelt Nicht!.

Degree: MS, Department of Computer Science, 2012, California State University – Northridge

Monte Carlo simulations have often been used in artificial intelligence for many games. Combined with a search tree and millions of random simulations per second, they can produce a recommended move according to the simulation strategy. Results differ by game, however: the tree structure and the search formula must be tailored to each type of game. In this project, I created a program called SteinBot for playing Einstein Wurfelt Nicht, which uses an improved version of the MCTS algorithm to produce a recommended move. The MCTS implementation was adapted specifically to handle die values. Testing SteinBot over hundreds of games with purely random simulations confirmed that improvements to both the random playouts and the search formula were needed. With the improved strategy, SteinBot played against many human opponents and won 63% of its games. Advisors/Committee Members: Lorentz, Richard J. (advisor), Barnes, George Michael (committee member).
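The "search formula" this abstract refers to is, in standard MCTS, the UCT rule (UCB1 applied to trees), which the record's subject keywords also name. The sketch below illustrates that generic rule, not SteinBot's actual code; the dictionary layout of the child nodes is an assumption for the example.

```python
import math

def uct_score(child_value, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 applied to trees (UCT): balance the child's average reward
    (exploitation) against how rarely it has been visited (exploration)."""
    if child_visits == 0:
        return float("inf")  # unvisited children are tried first
    return child_value / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits
    )

def select_child(children, parent_visits):
    """Descend the tree by picking the child with the highest UCT score."""
    return max(
        children,
        key=lambda ch: uct_score(ch["value"], ch["visits"], parent_visits),
    )

# Accumulated reward and visit counts from earlier random simulations:
children = [
    {"move": "a", "value": 6.0, "visits": 10},
    {"move": "b", "value": 3.0, "visits": 4},
    {"move": "c", "value": 0.0, "visits": 0},
]
best = select_child(children, parent_visits=14)  # "c" is still unvisited
```

The exploration constant `c` is exactly the kind of knob the abstract says must be tuned per game; a stochastic game such as Einstein Wurfelt Nicht additionally needs chance nodes for the die, which this sketch omits.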

Subjects/Keywords: Upper confidence bounds applied to trees; Einstein Wurfelt Nicht; Dissertations, Academic  – CSUN  – Computer Science.



APA (6th Edition):

Shahbazian, S. (2012). Monte Carlo tree search in Einstein Wurfelt Nicht!. (Masters Thesis). California State University – Northridge. Retrieved from http://hdl.handle.net/10211.2/1612

Chicago Manual of Style (16th Edition):

Shahbazian, Sarmen. “Monte Carlo tree search in Einstein Wurfelt Nicht!.” 2012. Masters Thesis, California State University – Northridge. Accessed December 15, 2019. http://hdl.handle.net/10211.2/1612.

MLA Handbook (7th Edition):

Shahbazian, Sarmen. “Monte Carlo tree search in Einstein Wurfelt Nicht!.” 2012. Web. 15 Dec 2019.

Vancouver:

Shahbazian S. Monte Carlo tree search in Einstein Wurfelt Nicht!. [Internet] [Masters thesis]. California State University – Northridge; 2012. [cited 2019 Dec 15]. Available from: http://hdl.handle.net/10211.2/1612.

Council of Science Editors:

Shahbazian S. Monte Carlo tree search in Einstein Wurfelt Nicht!. [Masters Thesis]. California State University – Northridge; 2012. Available from: http://hdl.handle.net/10211.2/1612


NSYSU

2. Chiang, Hsuan-yi. Action Segmentation and Learning by Inverse Reinforcement Learning.

Degree: Master, Electrical Engineering, 2015, NSYSU

Reinforcement learning allows agents to learn behaviors through trial and error. However, as the difficulty of a mission increases, its reward function also becomes harder to define. By combining an AdaBoost classifier with Upper Confidence Bounds (UCB), a method based on inverse reinforcement learning is proposed to construct the reward function of a complex mission. Inverse reinforcement learning lets the agent rebuild a reward function that imitates the expert's interaction with the environment. During imitation, the agent continuously compares the difference between the expert and itself, and the proposed method then determines a specific weight for each state via AdaBoost. That weight is combined with the state confidence from UCB to construct an approximate reward function. This thesis uses a state-encoding method and action segmentation to simplify the problem, then applies the proposed method to determine the optimal reward function. Finally, a maze environment and a simulated soccer-robot environment are used to validate the proposed method and to demonstrate the reduction in computational time. Advisors/Committee Members: Ming-Yi Ju (chair), Yu-Jen Chen (chair), Kao-Shing Hwang (committee member), Jin-Ling Lin (chair).
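As a generic illustration of the UCB ingredient in this abstract: the UCB1 confidence term grows for rarely visited states, and the abstract describes combining an AdaBoost-derived state weight with that confidence. The combination rule below (`approximate_reward`, a simple product) is a hypothetical stand-in, since the thesis's exact formula is not given in this record.

```python
import math

def ucb_confidence(visits, total_visits, c=1.0):
    """Standard UCB1 exploration term for a state: states visited less
    often, relative to the total, receive a larger confidence bonus."""
    if visits == 0:
        return float("inf")  # never-visited states get maximal confidence
    return c * math.sqrt(math.log(total_visits) / visits)

def approximate_reward(state_weight, visits, total_visits):
    """Hypothetical combination of a per-state AdaBoost weight with the
    UCB confidence term, in the spirit of the abstract (illustrative only)."""
    return state_weight * ucb_confidence(visits, total_visits)
```

Under this sketch, a state the expert visits with high AdaBoost weight but low visit count receives a large approximate reward, steering the learner toward under-explored expert-like states.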

Subjects/Keywords: Upper Confidence Bounds; Adaboost classifier; reward function; Inverse Reinforcement learning; Reinforcement learning



APA (6th Edition):

Chiang, H. (2015). Action Segmentation and Learning by Inverse Reinforcement Learning. (Thesis). NSYSU. Retrieved from http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0906115-151230

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Chiang, Hsuan-yi. “Action Segmentation and Learning by Inverse Reinforcement Learning.” 2015. Thesis, NSYSU. Accessed December 15, 2019. http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0906115-151230.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

Chiang, Hsuan-yi. “Action Segmentation and Learning by Inverse Reinforcement Learning.” 2015. Web. 15 Dec 2019.

Vancouver:

Chiang H. Action Segmentation and Learning by Inverse Reinforcement Learning. [Internet] [Thesis]. NSYSU; 2015. [cited 2019 Dec 15]. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0906115-151230.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

Chiang H. Action Segmentation and Learning by Inverse Reinforcement Learning. [Thesis]. NSYSU; 2015. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0906115-151230

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

3. Piette, Eric. Une nouvelle approche au General Game Playing dirigée par les contraintes : A stochastic constraint-based approach to General Game Playing.

Degree: Docteur es, Informatique, 2016, Université d'Artois

The ability of a computer program to play any strategic game effectively, often referred to as General Game Playing (GGP), is a key challenge in AI. GGP competitions, where each game is represented by a set of logical rules in the Game Description Language (GDL), have led researchers to compare many approaches, including Monte Carlo methods, the automatic construction of evaluation functions, logic programming, and answer set programming. In this thesis, we propose a new approach driven by stochastic constraints. We first focus on translating GDL into stochastic constraint networks (SCSP) in order to provide compact representations of strategic games and to model strategies. We then exploit a fragment of SCSP through an algorithm called MAC-UCB, coupling the MAC (Maintaining Arc Consistency) algorithm, used to solve each stage of the SCSP in turn, with the UCB (Upper Confidence Bound) policy, used to estimate the utility of each strategy obtained at the last stage of each sequence. The efficiency of this technique over other GGP approaches is confirmed by WoodStock, which implements MAC-UCB and is the current leader of the GGP Continuous Tournament. Finally, in the last part, we propose an alternative approach to symmetry detection in stochastic games, inspired by constraint programming. We show experimentally that this approach, coupled with MAC-UCB, outperforms the best approaches in the field and allowed WoodStock to become the 2016 GGP champion.
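The MAC-UCB loop described above pairs a constraint solver (MAC) with the UCB policy. The sketch below shows only the UCB half, under the assumption that MAC has already produced a set of candidate strategies with sampled utilities; names such as `Strategy` and `ucb_pick` are illustrative, not the thesis's API.

```python
import math
from dataclasses import dataclass

@dataclass
class Strategy:
    """A candidate strategy (assumed to come from the MAC solver),
    with running statistics from sampled playouts."""
    name: str
    total_utility: float = 0.0
    plays: int = 0

def ucb_pick(strategies, exploration=math.sqrt(2)):
    """UCB policy: choose the strategy with the highest upper confidence
    bound on its mean utility. Unplayed strategies are sampled first."""
    total = sum(s.plays for s in strategies)
    def bound(s):
        if s.plays == 0:
            return float("inf")
        return s.total_utility / s.plays + exploration * math.sqrt(
            math.log(total) / s.plays
        )
    return max(strategies, key=bound)

strategies = [Strategy("s1", 8.0, 10), Strategy("s2", 5.0, 5), Strategy("s3")]
chosen = ucb_pick(strategies)  # "s3" has no plays yet, so it is tried first
```

In the thesis's setting, each call to the (assumed) solver yields the strategies for one stage of the SCSP, and UCB allocates further sampling toward the most promising of them.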

Advisors/Committee Members: Lagrue, Sylvain (thesis director), Koriche, Frédéric (thesis director).

Subjects/Keywords: Théorie des jeux; Apprentissage automatique; Programmation par contraintes; Représentation des connaissances; Game theory; Machine learning; Constraint programming; Knowledge representation; CSP stochastique; Upper confidence bounds; 006.3



APA (6th Edition):

Piette, E. (2016). Une nouvelle approche au General Game Playing dirigée par les contraintes : A stochastic constraint-based approach to General Game Playing. (Doctoral Dissertation). Université d'Artois. Retrieved from http://www.theses.fr/2016ARTO0401

Chicago Manual of Style (16th Edition):

Piette, Eric. “Une nouvelle approche au General Game Playing dirigée par les contraintes : A stochastic constraint-based approach to General Game Playing.” 2016. Doctoral Dissertation, Université d'Artois. Accessed December 15, 2019. http://www.theses.fr/2016ARTO0401.

MLA Handbook (7th Edition):

Piette, Eric. “Une nouvelle approche au General Game Playing dirigée par les contraintes : A stochastic constraint-based approach to General Game Playing.” 2016. Web. 15 Dec 2019.

Vancouver:

Piette E. Une nouvelle approche au General Game Playing dirigée par les contraintes : A stochastic constraint-based approach to General Game Playing. [Internet] [Doctoral dissertation]. Université d'Artois; 2016. [cited 2019 Dec 15]. Available from: http://www.theses.fr/2016ARTO0401.

Council of Science Editors:

Piette E. Une nouvelle approche au General Game Playing dirigée par les contraintes : A stochastic constraint-based approach to General Game Playing. [Doctoral Dissertation]. Université d'Artois; 2016. Available from: http://www.theses.fr/2016ARTO0401
