You searched for subject:(POMDP). Showing records 1 – 30 of 88 total matches.

University of Illinois – Chicago
1.
Olivo, Iacopo.
Solving Interactive POMDPS in Julia.
Degree: 2019, University of Illinois – Chicago
URL: http://hdl.handle.net/10027/23839
Acting optimally in partially observable multi-agent stochastic domains is a growing topic in the Artificial Intelligence community, and several solutions have been proposed. Interactive POMDPs (I-POMDPs) are among the most complete solutions, but they are highly susceptible to the curse of dimensionality and the curse of history, and several teams are proposing algorithms to overcome these difficulties.
This work proposes the Julia.IPOMDPs framework to standardize the development and testing of such solving algorithms (Chapter 3). The framework introduces ways to declare I-POMDP agents and to define an agent hierarchy; Julia.IPOMDPs takes advantage of Julia.POMDPs to provide solutions to POMDPs. The project also proposes a new way to solve I-POMDPs by reducing them to POMDPs (Chapter 4) and solving them by means of Julia.POMDPs. However, this solver could not provide the expected precision, due to loss of information in the conversion. The online solver is tested on a redefinition of the multi-agent tiger game; tests and results are analyzed in Chapter 5 and show the simplicity of defining several different frames and problem setups.
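As background for the multi-agent tiger game mentioned above, the belief update of the classic single-agent tiger POMDP can be sketched in a few lines of Python. The observation accuracy of 0.85 is the usual textbook illustration, not a number from this thesis:

```python
# Belief update for the classic (single-agent) tiger POMDP.
# States: tiger behind the left (TL) or right (TR) door.
# The "listen" action leaves the state unchanged; the growl is heard
# on the tiger's actual side with probability 0.85 (illustrative value).

OBS_ACC = 0.85  # P(hear growl on the tiger's side | listen)

def update_belief(b_left, heard_left):
    """Bayes update of P(tiger is left) after one listen action."""
    p_obs_given_left = OBS_ACC if heard_left else 1 - OBS_ACC
    p_obs_given_right = 1 - OBS_ACC if heard_left else OBS_ACC
    num = p_obs_given_left * b_left
    den = num + p_obs_given_right * (1 - b_left)
    return num / den

b = 0.5
for _ in range(2):          # hear growl-left twice in a row
    b = update_belief(b, heard_left=True)
print(round(b, 3))          # prints 0.97
```

Two consistent observations push the belief from 0.5 to about 0.97, which is why "listen" actions are valuable before opening a door.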
Advisors/Committee Members: Gmytrasiewicz, Piotr (advisor), Piccolo, Elio (committee member), Ziebart, Brian (committee member), Gmytrasiewicz, Piotr (chair).
Subjects/Keywords: POMDP; I-POMDP; Julia; IPOMDPs

Mississippi State University
2.
Marwah, Gaurav.
ALGORITHMS FOR STOCHASTIC FINITE MEMORY CONTROL OF PARTIALLY OBSERVABLE SYSTEMS.
Degree: MS, Computer Science, 2005, Mississippi State University
URL: http://sun.library.msstate.edu/ETD-db/theses/available/etd-07082005-132056/
A partially observable Markov decision process (POMDP) is a mathematical framework for planning and control problems in which actions have stochastic effects and observations provide uncertain state information. It is widely used for research in decision-theoretic planning and reinforcement learning. To cope with partial observability, a policy (or plan) must use memory, and previous work has shown that a finite-state controller provides a good policy representation. This thesis considers a previously developed bounded policy iteration algorithm for POMDPs that finds policies in the form of stochastic finite-state controllers, and develops two improvements of this algorithm. The first is a simplification of the basic linear program used to find improved controllers, which considerably speeds up the original algorithm. The second is a branch-and-bound algorithm for adding the best possible node to the controller, which provides an error bound and a test for global optimality. Experimental results show that these enhancements significantly improve the algorithm's performance.
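A fixed stochastic finite-state controller of the kind described above can be evaluated exactly by solving a linear system over (node, state) pairs. The sketch below, with made-up transition, observation, and reward tables, illustrates only that evaluation step, not the thesis's bounded policy iteration algorithm:

```python
import numpy as np

# Evaluate a fixed stochastic finite-state controller (FSC) on a tiny
# illustrative POMDP (all numbers are random placeholders for the sketch).
nS, nA, nO, nQ = 2, 2, 2, 2
gamma = 0.95
rng = np.random.default_rng(0)

T = rng.dirichlet(np.ones(nS), size=(nS, nA))       # T[s, a, s']
O = rng.dirichlet(np.ones(nO), size=(nS, nA))       # O[s', a, o]
R = rng.normal(size=(nS, nA))                       # R[s, a]
psi = rng.dirichlet(np.ones(nA), size=nQ)           # psi[q, a]: action choice
eta = rng.dirichlet(np.ones(nQ), size=(nQ, nA, nO)) # eta[q, a, o, q']: node transition

# Linear system over the joint index (q, s):
# V(q,s) = sum_a psi[q,a] ( R[s,a]
#          + gamma * sum_{s',o,q'} T[s,a,s'] O[s',a,o] eta[q,a,o,q'] V(q',s') )
n = nQ * nS
A = np.zeros((n, n))
b = np.zeros(n)
for q in range(nQ):
    for s in range(nS):
        i = q * nS + s
        b[i] = psi[q] @ R[s]
        for a in range(nA):
            for s2 in range(nS):
                for o in range(nO):
                    for q2 in range(nQ):
                        j = q2 * nS + s2
                        A[i, j] += psi[q, a] * T[s, a, s2] * O[s2, a, o] * eta[q, a, o, q2]

V = np.linalg.solve(np.eye(n) - gamma * A, b)
print(V.reshape(nQ, nS))   # value of starting in node q with hidden state s
```

Bounded policy iteration then searches, via a linear program, for new node parameters psi and eta that increase V at every state, which is the step the thesis's first improvement simplifies.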
Advisors/Committee Members: Eric A. Hansen (chair).
Subjects/Keywords: POMDP

Cornell University
3.
Thorbergsson, Leifur.
Experimental Design For Partially Observed Markov Decision Processes.
Degree: PhD, Statistics, 2014, Cornell University
URL: http://hdl.handle.net/1813/38999
This thesis considers the question of how to most effectively conduct experiments in Partially Observed Markov Decision Processes so as to provide data that is most informative about a parameter of interest. Methods from Markov decision processes, especially dynamic programming, are introduced and then used in algorithms to maximize a relevant Fisher information. These algorithms are then applied to two POMDP examples. The methods developed can also be applied to stochastic dynamical systems by suitable discretization; we consequently show what control policies look like in the Morris-Lecar neuron model and the Rosenzweig-MacArthur model, and simulation results are presented. We discuss how parameter dependence within these methods can be dealt with by the use of priors, and develop tools to update control policies online. This is demonstrated in another stochastic dynamical system describing the growth dynamics of DNA template in a PCR model.
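To illustrate the basic idea of choosing inputs that maximize Fisher information, here is a greedy one-step toy with an assumed Gaussian observation model. This is only a sketch of the underlying criterion, not the thesis's dynamic-programming algorithms:

```python
import math

# One-step Fisher-information-based input selection (toy illustration).
# Assumed observation model: y = mu_a(theta) + noise, noise ~ N(0, SIGMA^2).
# For Gaussian noise, the Fisher information about theta from action a is
#   I_a(theta) = (d mu_a / d theta)^2 / SIGMA^2,
# so the greedy design picks the action with the steepest mean response.

SIGMA = 0.5
ACTIONS = [0.2, 1.0, 3.0]          # each action scales the response differently

def dmu_dtheta(a, theta):
    # Assumed response mu_a(theta) = a * sin(theta)  =>  d/dtheta = a * cos(theta)
    return a * math.cos(theta)

def fisher_info(a, theta):
    return dmu_dtheta(a, theta) ** 2 / SIGMA ** 2

def best_action(theta):
    return max(ACTIONS, key=lambda a: fisher_info(a, theta))

print(best_action(0.1))   # the largest gain dominates when cos(theta) != 0
```

The thesis goes further: dynamic programming accounts for how today's input changes tomorrow's state, rather than greedily maximizing one step at a time.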
Advisors/Committee Members: Hooker, Giles J. (chair), Turnbull, Bruce William (committee member), Booth, James (committee member).
Subjects/Keywords: Experimental Design; POMDP; Diffusion processes

West Virginia University
4.
Nguyen, Jennifer Quyen.
Navigation under Obstacle Motion Uncertainty using Markov Decision Processes.
Degree: MS, Mechanical and Aerospace Engineering, 2020, West Virginia University
URL: https://doi.org/10.33915/etd.7516
https://researchrepository.wvu.edu/etd/7516
In terms of navigation, a central problem in the field of autonomous robotics is obstacle avoidance. This research explores how to navigate and avoid obstacles by leveraging what is known of the environment and combining it with new information that arrives during execution. The algorithm presented in this work is divided into two procedures: an offline process that uses prior knowledge to navigate toward the goal, and an online execution strategy that leverages the offline results to drive safely toward the target when new information (e.g., obstacles) is encountered. To take advantage of what is known offline, the navigation problem is formulated as a Markov Decision Process (MDP) in which the environment is characterized as an occupancy grid. Baseline dynamic programming techniques solve this MDP, producing general behaviors that drive the robot (or agent) toward the goal and a value function that encodes the value of being in particular states. During online execution, the agent combines these offline results with surrounding local information about the environment (e.g., data from a LIDAR sensor). This locally acquired information, which may contain data not seen before, is represented as a small occupancy grid, and the offline value function is used to define local goals that allow the agent to make short-term plans. When the agent encounters an obstacle locally, the problem becomes a Partially Observable Markov Decision Process (POMDP), since it is uncertain where the obstacle will be in the next state. This is solved with an approximate planner (QMDP) that models the uncertainty of the obstacle motion and considers all possible obstacle state combinations in the next time step to determine the best action. The approximate planner can solve the POMDP quickly, thanks to the small size of the local occupancy grid and to the offline behaviors that speed up convergence, which opens the possibility of executing this procedure in real time on a physical robot. Two simulated environments of varying complexity and numbers of dynamic obstacles were created. In cluttered test cases with narrow operable spaces and multiple dynamic obstacles, the proposed algorithm achieved approximately an 85% success rate and produced safer trajectories than the baseline approach, which had roughly a 37% success rate, under the assumption that dynamic obstacles can move only a short distance per time step.
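The offline/online split described above can be illustrated with a generic QMDP sketch: value iteration on the underlying MDP offline, then belief-weighted Q-values online. The model numbers below are random placeholders, not the thesis's occupancy-grid problem:

```python
import numpy as np

# QMDP action selection (generic sketch, illustrative model).
# Offline: solve the fully observable MDP for Q(s, a).
# Online:  given a belief b over states, pick argmax_a sum_s b(s) Q(s, a).

nS, nA, gamma = 3, 2, 0.95
rng = np.random.default_rng(1)
T = rng.dirichlet(np.ones(nS), size=(nS, nA))   # T[s, a, s']
R = rng.normal(size=(nS, nA))                    # R[s, a]

# Offline value iteration on the underlying MDP.
Q = np.zeros((nS, nA))
for _ in range(500):
    V = Q.max(axis=1)
    Q = R + gamma * (T @ V)                      # T @ V has shape (nS, nA)

def qmdp_action(belief):
    """QMDP assumes full observability after one step: cheap but approximate."""
    return int(np.argmax(belief @ Q))

b = np.array([0.6, 0.3, 0.1])
print(qmdp_action(b))
```

Because QMDP assumes the state becomes fully observable after one step, it never chooses actions purely to gather information, which is why it works best, as here, on small local problems refreshed every step with new sensor data.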
Advisors/Committee Members: Yu Gu, Jason Gross.
Subjects/Keywords: navigation; obstacle avoidance; MDP; POMDP; QMDP

University of Waterloo
5.
Shu, Keqi.
Autonomous Driving at Intersections: A Critical-Turning-Point Approach for Planning and Decision Making.
Degree: 2020, University of Waterloo
URL: http://hdl.handle.net/10012/16190
Left turns at unsignalized intersections are among the most challenging tasks for urban automated driving, owing to the varied shapes of intersections and the rapidly changing nature of driving scenarios. Many algorithms, including rule-based, graph-based, and optimization-based approaches, have been developed to address the problem. However, most implemented algorithms find it difficult to guarantee safety at intersections in real time because of the large uncertainty involved; algorithms that instead aim to always keep a safe distance in all cases often become overly conservative, which can itself be dangerous and inefficient.
This thesis addresses the challenge with a generalized critical turning point (CTP) based hierarchical decision-making and planning method, which enables safe and efficient planning and decision making for autonomous vehicles. The high-level candidate-path planner takes road map information and generates CTPs using a parameterized CTP extraction model, proposed and verified with naturalistic driving data. The CTP is a novel concept, and the corresponding model is used to generate behavior-oriented paths that adapt to various intersections. These modifications assure high search efficiency in the planning process while enabling human-like driving behavior of the autonomous vehicle. The low-level planner formulates the decision-making task as a POMDP that accounts for the uncertainties of the agents in the intersection; the POMDP is then solved with a Monte Carlo tree search (MCTS) based framework to select proper candidate paths and decide the actions on each path.
The proposed framework is tested in several critical scenarios and outperforms methods that do not use CTPs. It adapts to intersections of various shapes, with different numbers of road lanes and different widths of median strips, and completes left turns while keeping proper safety distances. Because the CTP concept is derived from human left-turning behavior, the framework performs human-like actions that are easier for the other agents at the intersection to anticipate, which also improves the safety of the ego vehicle. The framework further allows personalized trade-offs between desired real-time performance and stability. The POMDP model, which considers the unknown intentions of the surrounding vehicles, enables commute-efficient two-dimensional planning and decision making. In all, the proposed framework lets the ego vehicle perform less conservative, human-like actions while considering the potential for crashes in real time, which not only improves commute efficiency but also enables urban autonomous vehicles to integrate naturally into scenarios with human-driven vehicles in a friendly manner.
Subjects/Keywords: autonomous driving; decision making; POMDP; path planning

University of Illinois – Chicago
6.
Yavolovsky, Andrey.
Decision-Theoretic Monitoring of Cyber-Physical Systems.
Degree: 2018, University of Illinois – Chicago
URL: http://hdl.handle.net/10027/22592
Cyber-physical systems (CPS) represent "engineered systems that are built from, and depend upon, the seamless integration of computational algorithms and physical components." They can be found in areas such as aerospace, manufacturing, transportation, entertainment, healthcare, and automotive. For some of these systems the top priority is correct functioning: incorrect operation of systems like autonomous vehicles, medical devices, or aircraft may lead to catastrophic consequences. Formally proving system correctness, however, is a challenging problem. In recent years, runtime monitoring, in which a monitor observes the outputs of the system and determines whether a system specification has been violated, has emerged as an attractive alternative.
In this thesis, we focus on a decision-theoretic approach to monitoring safety properties in CPS. In particular, we formulate the monitoring problem as a Partially Observable Markov Decision Process (POMDP), whereby deciding whether a run is safe corresponds to executing the optimal policy of the monitoring POMDP. We show how a Monte Carlo planning algorithm (POMCP) can be used to compute this optimal policy. The monitoring POMDP's reward structure is naturally described with four parameters, and an important question is how it affects the monitoring performance, quantified through acceptance accuracy, rejection accuracy, and monitoring time. We analyze the performance of the POMDP-based monitors for special choices of the reward structure and compare them with the performance of traditional threshold-based monitors. Our results show that, using POMDPs, we can sometimes improve both accuracies of the monitor while simultaneously decreasing the monitoring time.
Further, we study POMDP-based monitors for systems with terminal strongly connected components. For this class of systems, we derive expressions for the POMDP value function, which allow us to demonstrate how the POMDP policy can take advantage of the properties of the monitored system and thus provide enhanced monitoring performance.
To study decision-theoretic monitors and experiment with them, we have developed a software tool called the Decision-Theoretic Monitoring Tool (DTMT). Its architecture allows easy integration of new monitoring decision rules and experimentation with arbitrary user-defined system and property models.
Finally, we evaluate POMDP monitors on an experimental robotic system: we simulate the operation of a simplified transmission system on a mobile robot developed specifically for this purpose. The experiments confirm that POMDP-based monitors improve over traditional threshold-based approaches. We further show that POMDP-based monitors implemented through POMCP can be used online even for large problems, and thus provide an attractive and flexible alternative to traditional threshold-based monitors.
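A traditional threshold-based monitor of the kind the thesis compares against can be sketched as a hidden-Markov filter with accept/reject thresholds. All probabilities below are illustrative assumptions, not values from the thesis:

```python
# Threshold-based runtime monitor (sketch with made-up numbers):
# track the probability that the system is in a "safe" mode via a
# hidden-Markov filter and decide once the posterior crosses a threshold.

P_STAY_SAFE = 0.98          # P(safe -> safe) per step; unsafe is absorbing
P_OBS = {                   # P(observation | mode)
    "safe":   {"ok": 0.9, "alarm": 0.1},
    "unsafe": {"ok": 0.3, "alarm": 0.7},
}
ACCEPT, REJECT = 0.95, 0.05

def monitor(observations, p_safe=0.9):
    for step, obs in enumerate(observations):
        # Predict: the safe mode may decay into the unsafe mode.
        p_safe *= P_STAY_SAFE
        # Update the posterior with Bayes' rule.
        num = P_OBS["safe"][obs] * p_safe
        den = num + P_OBS["unsafe"][obs] * (1 - p_safe)
        p_safe = num / den
        if p_safe >= ACCEPT:
            return "accept", step
        if p_safe <= REJECT:
            return "reject", step
    return "undecided", len(observations)

print(monitor(["alarm"] * 6))    # prints ('reject', 2)
```

The POMDP formulation replaces the fixed thresholds with a reward structure, letting the planner trade off accuracy against monitoring time instead of hard-coding that trade-off.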
Advisors/Committee Members: Žefran, Miloš (advisor), Sistla, Prasad (advisor), Gmytrasiewicz, Piotr (committee member), Venkatakrishnan, Venkat (committee member), Soltanalian, Mojtaba (committee member), Žefran, Miloš (chair).
Subjects/Keywords: Run-time; Monitoring; Decision Theory; POMDP

Rice University
7.
Grady, Devin.
Motion Planning with Uncertain Information in Robotic Tasks.
Degree: PhD, Engineering, 2014, Rice University
URL: http://hdl.handle.net/1911/76728
In the real world, robots operate with imperfect sensors that provide uncertain and incomplete information. We develop techniques to solve motion planning problems with imperfect information in order to accomplish a variety of robotic tasks, including navigation, search-and-rescue, and exposure minimization. This thesis focuses on the challenge of creating robust policies, which map input observations to output actions, for robots with imperfect actions and sensing. The existing tools for such problems, typically formulated as Partially Observable Markov Decision Processes (POMDPs), can only handle small problem instances; this thesis proposes several techniques to expand the size of problem instance that can be considered. Because executing a policy is simple once the offline computation is done, even inexpensive, computationally constrained robots can use these policies to solve the tasks mentioned.
First, we show that the solution of an abstracted action space can be used to bootstrap a complete solution for navigation. Generalizing this abstraction to both action and state spaces expands the set of problems that can be solved. Additionally, we apply the concept of abstraction to the workspace: we develop a method that computes local solutions to a noisy navigation problem and stitches them together into a global solution. Our proposed methods are run on large problem instances, and the output policies are compared against policies generated with existing techniques. Though these large tasks are often unsolvable with previous methods, abstraction allows us to find high-quality policies. Our findings show that these techniques significantly increase the size of tasks, involving planning with uncertain information, for which solutions can be found; they generally offer significant speed increases and often improve solution quality as well.
Additionally, this thesis includes work on two separate problems. First, we solve a task in which several robots cooperate to quickly classify an observed object as one of several possible types using a camera. Second, we solve a task in which a single robot navigates to a destination quickly but may need to allocate time toward obtaining information about a new object discovered along the way.
Advisors/Committee Members: Kavraki, Lydia E. (advisor), McLurkin, James (committee member), Moll, Mark (committee member), O'Malley, Marcia K. (committee member).
Subjects/Keywords: Robotics; Motion planning; POMDP; Sensory noise

Delft University of Technology
8.
Zhou, Bingyu (author).
Probabilistic Motion Planning in Uncertain and Dynamic Environments.
Degree: 2017, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:f491f7d8-a2f5-4f89-b4b7-86ac6b64546b
Autonomous driving is a popular and advanced research field that aims to reduce the mortality rate and improve the welfare and efficiency of commuters' lives. As one of the important research branches of self-driving cars, this thesis focuses on motion planning in dynamic and uncertain environments. The challenge of such planning is that the ego-vehicle (or robot) needs to reason intelligently about the interactions between itself and the other traffic participants. To make decisions that guarantee safety and user comfort, the motion planner must account for the future motions of obstacles and the uncertainty of their motion intentions, so that the ego-vehicle does not end up in inevitable collision states. This thesis presents three approaches that attack the challenge from different perspectives. The first, multipolicy MPC, introduces the uncertainty of motion intentions into a traditional optimization framework by modeling multiple hypotheses of future trajectories as mixture Gaussian distributions. The second, centralized MPC, computes motion plans for all vehicles, including the obstacle vehicles, by assuming the obstacles are controllable and broadcast their motion intentions. The last approach, joint behavior estimation and planning, is a novel algorithm that takes the interactions into account: it leverages the strengths of online POMDP planning to model the interactions by anticipating the future trajectories of obstacles under different motion intentions, and the predicted trajectories are then used by the multipolicy MPC planner to compute optimal actions for the ego-vehicle. A parallelized structure of joint behavior estimation and planning is also presented to scale up to cluttered environments.
Simulation results demonstrate the benefits of the proposed approaches, particularly joint behavior estimation and planning, in uncertain and interactive environments.
Systems and Control
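The mixture-Gaussian trajectory hypotheses used by the multipolicy MPC approach can be illustrated by sampling obstacle rollouts from a distribution over motion intentions. The intentions and parameters below are assumptions made for the sketch, not values from the thesis:

```python
import numpy as np

# Sampling future obstacle positions from a Gaussian mixture over motion
# intentions (illustrative sketch; intentions and numbers are assumed).
rng = np.random.default_rng(2)

INTENTIONS = {
    # name: (probability, mean velocity [vx, vy] in m/s)
    "go_straight": (0.6, np.array([1.0, 0.0])),
    "turn_left":   (0.3, np.array([0.7, 0.7])),
    "brake":       (0.1, np.array([0.2, 0.0])),
}
SIGMA = 0.1   # per-step position noise (std dev, assumed)

def sample_trajectory(x0, horizon, dt=0.5):
    """Sample one future trajectory: pick an intention, then roll it out."""
    names = list(INTENTIONS)
    probs = [INTENTIONS[n][0] for n in names]
    intent = names[rng.choice(len(names), p=probs)]
    vel = INTENTIONS[intent][1]
    traj, x = [], np.asarray(x0, dtype=float)
    for _ in range(horizon):
        x = x + vel * dt + rng.normal(0.0, SIGMA, size=2)
        traj.append(x.copy())
    return intent, np.array(traj)

intent, traj = sample_trajectory([0.0, 0.0], horizon=6)
print(intent, traj.shape)   # one sampled hypothesis and its (6, 2) rollout
```

An MPC planner can then optimize the ego trajectory against many such sampled hypotheses, penalizing plans that come close to any likely rollout.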
Advisors/Committee Members: Alonso Mora, Javier (mentor), Delft University of Technology (degree granting institution).
Subjects/Keywords: Motion planning; Trajectory generation; Interaction; MPC; POMDP
APA (6th Edition):
Zhou, B. (2017). Probabilistic Motion Planning in Uncertain and Dynamic Environments. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:f491f7d8-a2f5-4f89-b4b7-86ac6b64546b

Delft University of Technology
9.
Vroom, Quinn (author).
POMDP based online parameter estimation for autonomous passenger vehicles: Researching online tyre parameter estimation performance by improving the trajectory using a POMDP algorithm.
Degree: 2019, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:192fdf22-0c21-49f4-9413-211276871008
▼ The internal model is an important piece of the control system of an autonomous driving vehicle. For the model to deliver accurate predictions, a valid model structure and well-chosen parameters are needed. Model parameters can fluctuate strongly or be complex to predict, especially in tyre-ground interaction models. Instead of predicting parameter values beforehand, they can be estimated and updated in real time, so that fluctuation or incorrectness is adjusted for while driving. However, this uncertainty in parameter values must be accounted for when applying control. This paper investigates solving the problem by treating the uncertainty in the parameters of the internal vehicle model as a POMDP. The research question is: is it worthwhile to use the POMDP approach for online parameter estimation of autonomous passenger vehicles? To answer this, multiple sub-questions were composed. We start by asking: what is the most suitable vehicle model? Different vehicle models and tyre models were compared; the literature showed the bicycle model combined with the linearized tyre model to be most suitable for autonomous passenger vehicles. The next question is: what is the most promising algorithm? Using the literature, suitable algorithms for solving this POMDP were found and compared, and from three compelling algorithms the one best fitting the autonomous driving criteria was chosen. Knowing the model and the algorithm, the next question became: does the algorithm perform on a vehicle model? The simulation was implemented in MATLAB and its performance tested; the results showed a significant increase in parameter estimation performance, with the estimate converging correctly within 2 timesteps. The next question is: does the algorithm perform within realistic bounds? The same simulation was used, but with saturation on the steering input. This still showed improved parameter estimation compared to the original trajectory, though not as overwhelmingly as without saturation. The next question is: does the algorithm suffer from high noise? The same simulation was run with different noise levels, and the results showed parameter estimation performance significantly degraded by increasing noise. The final sub-question is: does the algorithm suit increasing model complexity? The number of parameters in the simulation was increased, and the large matrices that accompany the algorithm were examined. Results showed that increasing the complexity has a significant effect on the size of the simulation and algorithm matrices. In conclusion, these experiments produced some very interesting results and a useful insight into the strengths and weaknesses of the POMDP algorithm on a passenger vehicle, answering the research question. This also led to various recommendations for future…
Advisors/Committee Members: Wisse, Martijn (mentor), Spaan, Matthijs (graduation committee), Zhou, Hongpeng (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: POMDP; parameter estimation; online; tyre model
APA (6th Edition):
Vroom, Q. (2019). POMDP based online parameter estimation for autonomous passenger vehicles: Researching online tyre parameter estimation performance by improving the trajectory using a POMDP algorithm. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:192fdf22-0c21-49f4-9413-211276871008

University of Sydney
10.
Marchant Matus, Roman.
Bayesian Optimisation for Planning in Dynamic Environments
.
Degree: 2015, University of Sydney
URL: http://hdl.handle.net/2123/14497
▼ This thesis addresses the problem of trajectory planning for monitoring extreme values of an environmental phenomenon that changes in space and time. The most relevant case study is environmental monitoring using an autonomous mobile robot for air, water and land pollution monitoring. Since the dynamics of the phenomenon are initially unknown, the planning algorithm needs to satisfy two objectives simultaneously: 1) learn and predict spatial-temporal patterns and 2) find areas of interest (e.g. high pollution), addressing the exploration-exploitation trade-off. Consequently, the thesis makes the following contributions. Firstly, it formulates and applies Bayesian Optimisation (BO) for planning in robotics. By maintaining a Gaussian Process (GP) model of the environmental phenomenon, the planning algorithms are able to learn the spatial and temporal patterns. A new family of acquisition functions which consider the position of the robot is proposed, allowing efficient trajectory planning. Secondly, BO is generalised for optimisation over continuous paths, determining not only where and when to sample but also how to get there. Under these new circumstances, optimising the acquisition function at each iteration of the BO algorithm becomes costly, so a second layer of BO is included to effectively reduce the number of iterations. Finally, this thesis presents Sequential Bayesian Optimisation (SBO), a generalisation of the plain BO algorithm with the goal of achieving non-myopic trajectory planning. SBO is formulated under a Partially Observable Markov Decision Process (POMDP) framework, which can find the optimal decision for a sequence of actions with their respective outcomes. An online solution of the POMDP based on Monte Carlo Tree Search (MCTS) allows an efficient search for the optimal action with multistep lookahead. The proposed planning algorithms are evaluated under different scenarios.
Experiments on large scale ozone pollution monitoring and indoor light intensity monitoring are conducted for simulated and real robots. The results show the advantages of planning over continuous paths and also demonstrate the benefit of deeper search strategies using SBO.
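The idea of an acquisition function that accounts for the robot's position can be sketched as GP-UCB penalised by travel distance (a minimal illustration in the spirit of the abstract, not the thesis's actual acquisition family; the kernel, weights, and data below are made up):

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between two point sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_posterior(X, y, Xq, noise=1e-4):
    """GP posterior mean and variance at query points Xq."""
    K_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))
    Ks = rbf(Xq, X)
    mu = Ks @ K_inv @ y
    var = np.diag(rbf(Xq, Xq) - Ks @ K_inv @ Ks.T)
    return mu, np.maximum(var, 0.0)

def distance_aware_ucb(Xq, X, y, robot_pos, kappa=2.0, travel_weight=0.5):
    """UCB acquisition penalised by travel distance from the robot."""
    mu, var = gp_posterior(X, y, Xq)
    travel = np.linalg.norm(Xq - robot_pos, axis=1)
    return mu + kappa * np.sqrt(var) - travel_weight * travel

# Toy 2-D pollution field sampled at three locations; score two candidate waypoints.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([0.2, 0.8, 0.1])
Xq = np.array([[1.2, 0.1],   # near the best reading and near the robot
               [3.0, 3.0]])  # uncertain but far away
scores = distance_aware_ucb(Xq, X, y, robot_pos=np.array([1.0, 0.0]))
```

With this weighting the nearby high-value waypoint wins; raising `kappa` or lowering `travel_weight` shifts the balance toward distant, uncertain regions.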
Subjects/Keywords: Bayesian; Optimisation; Planning; Robotics; POMDP; Gaussian-Process
APA (6th Edition):
Marchant Matus, R. (2015). Bayesian Optimisation for Planning in Dynamic Environments. (Thesis). University of Sydney. Retrieved from http://hdl.handle.net/2123/14497

University of Sydney
11.
Morere, Philippe.
Bayesian Optimisation for Planning And Reinforcement Learning
.
Degree: 2019, University of Sydney
URL: http://hdl.handle.net/2123/21230
▼ This thesis addresses the problem of achieving efficient non-myopic decision making by explicitly balancing exploration and exploitation. Decision making, both in planning and reinforcement learning (RL), enables agents or robots to complete tasks by acting on their environments. Complexity arises when completing objectives requires sacrificing short-term performance in order to achieve better long-term performance. Decision making algorithms with this characteristic are known as non-myopic, and require long sequences of actions to be evaluated, greatly increasing the size of the search space. Optimal behaviours need to balance two key quantities: exploration and exploitation. Exploitation takes advantage of previously acquired information or high-performing solutions, whereas exploration focuses on acquiring more informative data. The balance between these quantities is crucial in both RL and planning. This thesis makes the following contributions. Firstly, a reward function trading off exploration and exploitation of gradients for sequential planning is proposed. It is based on Bayesian optimisation (BO) and is combined with a non-myopic planner to achieve efficient spatial monitoring. Secondly, the algorithm is extended to continuous action spaces, called continuous belief tree search (CBTS), and uses BO to dynamically sample actions within a tree search, balancing high-performing actions and novelty. Finally, the framework is extended to RL, for which a multi-objective methodology for explicit exploration-exploitation balance is proposed. The two objectives are modelled explicitly and balanced at the policy level, as in BO. This allows for online exploration strategies, as well as a data-efficient model-free RL algorithm that achieves exploration by minimising the uncertainty of Q-values (EMU-Q).
The proposed algorithms are evaluated on different simulated and real-world robotics problems, displaying superior performance in terms of sample efficiency and exploration.
Subjects/Keywords: Reinforcement Learning; Exploration; Planning; POMDP; Bayesian; Uncertainty
APA (6th Edition):
Morere, P. (2019). Bayesian Optimisation for Planning And Reinforcement Learning. (Thesis). University of Sydney. Retrieved from http://hdl.handle.net/2123/21230

Delft University of Technology
12.
Weltevrede, Max (author).
Planning for Money Laundering Investigations.
Degree: 2020, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:24babee2-0144-4d66-aadb-7cab83f5566c
▼ According to the United Nations, the amount of money laundered worldwide each year is an estimated 2-5% of global GDP (equivalent to 800 billion to 2 trillion US dollars). This is money that criminal enterprises rely on to operate. For that reason, the European Union demands that gatekeepers (banks and other obliged entities) apply measures to counteract money laundering. Current state-of-the-art anti-money laundering (AML) techniques in industry ultimately revolve around investigations of suspicious behaviour by specialized financial investigators. Because this work is done by humans, the process is relatively slow and has limited capacity. Deciding optimally which financial entities to investigate, and when, is not a trivial problem; however, optimizing this sequential decision-making problem could significantly decrease the time-scale on which fraudulent actors are caught. This thesis formulates the AML problem as a Partially Observable Markov Decision Process. It designs and implements an AML model and investigates the challenges associated with optimizing it. In particular, several Partially Observable Monte-Carlo Planning based methods are proposed that exploit the combinatorial structure of the actions to overcome the challenges associated with a large action space. The methods are empirically evaluated on the AML problem and compared to a baseline solution. The results indicate that exploiting the combinatorial structure increases performance in this problem scenario; however, exploiting the structure to the highest degree does not always lead to the best performance. Additionally, we show that the proposed methods can match or even outperform the upper bound set by the baseline solution.
Advisors/Committee Members: Spaan, M.T.J. (mentor), Oliehoek, F.A. (graduation committee), Alos Palop, Mireia (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Planning; Anti-Money Laundering; partially observable; POMDP
APA (6th Edition):
Weltevrede, M. (2020). Planning for Money Laundering Investigations. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:24babee2-0144-4d66-aadb-7cab83f5566c

Colorado State University
13.
Ragi, Shankarachary.
Cooperative control of mobile sensor platforms in dynamic environments.
Degree: PhD, Electrical and Computer Engineering, 2014, Colorado State University
URL: http://hdl.handle.net/10217/82529
▼ We develop guidance algorithms to control mobile sensor platforms, for both centralized and decentralized settings, in dynamic environments for various applications. More precisely, we develop control algorithms for the following mobile sensor platforms: unmanned aerial vehicles (UAVs) with on-board sensors for multitarget tracking, autonomous amphibious vehicles for flood-rescue operations, and directional sensors (e.g., surveillance cameras) for maximizing an information-gain-based objective function. The following is a brief description of each of the above-mentioned guidance control algorithms. We develop both centralized and decentralized control algorithms for UAVs based on the theories of the partially observable Markov decision process (POMDP) and the decentralized POMDP (Dec-POMDP), respectively. Both POMDPs and Dec-POMDPs are intractable to solve exactly; therefore we adopt an approximation method called nominal belief-state optimization (NBO) to solve (approximately) the control problems posed as a POMDP or a Dec-POMDP. We then address an amphibious vehicle guidance problem for a flood rescue application. Here, the goal is to control multiple autonomous amphibious vehicles while minimizing the average rescue time of multiple human targets stranded in a flood situation. We again pose this problem as a POMDP, and extend the above-mentioned NBO approximation method to solve the guidance problem. In the final phase, we study the problem of controlling multiple 2-D directional sensors while maximizing an objective function based on the information gain corresponding to multiple target locations. This problem is found to be a combinatorial optimization problem, so we develop heuristic methods to solve it approximately, and provide analytical results on performance guarantees. We then improve the performance of our heuristics by applying an approximate dynamic programming approach called rollout.
Advisors/Committee Members: Chong, Edwin K. P. (advisor), Krapf, Diego (committee member), Luo, J. Rockey (committee member), Oprea, Juliana (committee member).
Subjects/Keywords: sensor fusion; path planning for autonomous vehicles; target tracking; applications of POMDP and Dec-POMDP; decision making under uncertainty
APA (6th Edition):
Ragi, S. (2014). Cooperative control of mobile sensor platforms in dynamic environments. (Doctoral Dissertation). Colorado State University. Retrieved from http://hdl.handle.net/10217/82529

Duke University
14.
Liu, Miao.
Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments
.
Degree: 2014, Duke University
URL: http://hdl.handle.net/10161/9043
▼ As a growing number of agents are deployed in complex environments for scientific research and human well-being, there are increasing demands for designing efficient learning algorithms for these agents to improve their control polices. Such policies must account for uncertainties, including those caused by environmental stochasticity, sensor noise and communication restrictions. These challenges exist in missions such as planetary navigation, forest firefighting, and underwater exploration. Ideally, good control policies should allow the agents to deal with all the situations in an environment and enable them to accomplish their mission within the budgeted time and resources. However, a correct model of the environment is not typically available in advance, requiring the policy to be learned from data. Model-free reinforcement learning (RL) is a promising candidate for agents to learn control policies while engaged in complex tasks, because it allows the control policies to be learned directly from a subset of experiences and with time efficiency. Moreover, to ensure persistent performance improvement for RL, it is important that the control policies be concisely represented based on existing knowledge, and have the flexibility to accommodate new experience. Bayesian nonparametric methods (BNPMs) both allow the complexity of models to be adaptive to data, and provide a principled way for discovering and representing new knowledge. In this thesis, we investigate approaches for RL in centralized and decentralized sequential decision-making problems using BNPMs. We show how the control policies can be learned efficiently under model-free RL schemes with BNPMs. Specifically, for centralized sequential decision-making, we study Q learning with Gaussian processes to solve Markov decision processes, and we also employ hierarchical Dirichlet processes as the prior for the control policy parameters to solve partially observable Markov decision processes. 
For decentralized partially observable Markov decision processes, we use stick-breaking processes as the prior for the controller of each agent. We develop efficient inference algorithms for learning the corresponding control policies. We demonstrate that by combining model-free RL and BNPMs with efficient algorithm design, we are able to scale up RL methods for complex problems that cannot be solved due to the lack of model knowledge. We adaptively learn control policies with concise structure and high value, from a relatively small amount of data.
Advisors/Committee Members: Carin, Lawrence (advisor).
Subjects/Keywords: Electrical engineering; Computer engineering; Bayesian nonparametric methods; Decentralized POMDP; Finite state controller; Gaussian process; POMDP; reinforcement learning
APA (6th Edition):
Liu, M. (2014). Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments. (Thesis). Duke University. Retrieved from http://hdl.handle.net/10161/9043

Mississippi State University
15.
Shi, Jinchuan.
A framework for integrating influence diagrams and POMDPs.
Degree: PhD, Computer Science and Engineering, 2018, Mississippi State University
URL: http://sun.library.msstate.edu/ETD-db/theses/available/etd-03022018-153923/
▼ An influence diagram is a widely-used graphical model for representing and solving problems of sequential decision making under imperfect information. A closely-related model for the same class of problems is the partially observable Markov decision process (POMDP). This dissertation leverages the relationship between these two models to develop improved algorithms for solving influence diagrams. The primary contribution is to generalize two classic dynamic programming algorithms for solving influence diagrams, Arc Reversal and Variable Elimination, by integrating them with a dynamic programming technique originally developed for solving POMDPs. This generalization relaxes constraints on the ordering of the steps of these algorithms in a way that dramatically improves scalability, especially in solving complex, multi-stage decision problems. A secondary contribution is the adoption of a more compact and intuitive representation of the solution of an influence diagram, called a strategy. Instead of representing a strategy as a table or as a tree, a strategy is represented as an acyclic graph, which can be exponentially more compact, making the strategy easier to interpret and understand.
Advisors/Committee Members: Ioana Banicescu (committee member), J. Edward Swan II (committee member), Maxwell Young (committee member), Eric Hansen (chair).
Subjects/Keywords: POMDP; Graphical Model; Probabilistic Inference; Theoretical Decision Planning; Influence Diagram
APA (6th Edition):
Shi, J. (2018). A framework for integrating influence diagrams and POMDPs. (Doctoral Dissertation). Mississippi State University. Retrieved from http://sun.library.msstate.edu/ETD-db/theses/available/etd-03022018-153923/

University of Georgia
16.
Perez Barrenechea, Dennis David.
Anytime point based approximations for interactive POMDPs.
Degree: 2014, University of Georgia
URL: http://hdl.handle.net/10724/24464
Partially observable Markov decision processes (POMDPs) have been largely accepted as a rich framework for planning and control problems. In settings where multiple agents interact, POMDPs fail to model other agents explicitly. The interactive partially observable Markov decision process (I-POMDP) is a new paradigm that extends POMDPs to multiagent settings. The I-POMDP framework models other agents explicitly, making exact solution infeasible for all but the simplest settings. Thus arises a need for good approximation methods: methods that can find solutions within tight error bounds in short periods of time. We develop a point-based method for solving finitely nested I-POMDPs approximately. The method maintains a set of belief points and forms value functions that include only the value vectors that are optimal at these belief points. Since I-POMDP computation depends on predicting the actions of other agents in multiagent settings, we developed an interactive generalization of the point-based value iteration (PBVI) method that recursively solves all models of other agents. We present empirical results in domains from the literature and discuss the computational savings of the proposed method.
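The point-based scheme described above (a fixed set of belief points, keeping only the value vectors optimal at them) can be sketched for an ordinary single-agent POMDP. The tiger-style model below, its numbers, and its function names are invented for illustration and are not the thesis's I-POMDP algorithm:

```python
import numpy as np

# Hypothetical two-state, tiger-style POMDP used only to illustrate the
# point-based idea. Actions: 0 = listen, 1/2 = open a door.
S, A, Z = 2, 3, 2
gamma = 0.95
T = [np.eye(S), np.full((S, S), 0.5), np.full((S, S), 0.5)]   # T[a][s, s']
O = [np.array([[0.85, 0.15],                                  # O[a][s', z]
               [0.15, 0.85]]),
     np.full((S, Z), 0.5), np.full((S, Z), 0.5)]
R = np.array([[-1.0, -100.0, 10.0],                           # R[s, a]
              [-1.0, 10.0, -100.0]])

def pbvi_backup(beliefs, alphas):
    """One point-based backup: one new alpha-vector per belief point."""
    new_alphas = []
    for b in beliefs:
        best_val, best_alpha = -np.inf, None
        for a in range(A):
            alpha_a = R[:, a].copy()
            for z in range(Z):
                # Back-project every current vector through (a, z) and keep
                # the one that is best at this particular belief point.
                gs = [T[a] @ (O[a][:, z] * al) for al in alphas]
                alpha_a = alpha_a + gamma * max(gs, key=lambda g: b @ g)
            if b @ alpha_a > best_val:
                best_val, best_alpha = b @ alpha_a, alpha_a
        new_alphas.append(best_alpha)
    return new_alphas

beliefs = [np.array([0.5, 0.5]), np.array([0.9, 0.1]), np.array([0.1, 0.9])]
alphas = [np.zeros(S)]
for _ in range(30):
    alphas = pbvi_backup(beliefs, alphas)
print([round(float(b @ max(alphas, key=lambda al: b @ al)), 2) for b in beliefs])
```

The cost of each backup grows with the number of belief points rather than exponentially with the horizon, which is the source of the computational savings the abstract refers to.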
Subjects/Keywords: Markov Decision Process; Multiagent systems; Decision making; POMDP
APA (6th Edition):
Perez Barrenechea, D. D. (2014). Anytime point based approximations for interactive POMDPs. (Thesis). University of Georgia. Retrieved from http://hdl.handle.net/10724/24464

Texas A&M University
17.
Halbert, Tyler Raymond.
An Improved Algorithm for Sequential Information-Gathering Decisions in Design under Uncertainty.
Degree: MS, Mechanical Engineering, 2015, Texas A&M University
URL: http://hdl.handle.net/1969.1/155384
In engineering decision making, particularly in design, engineers must make decisions under varying levels of uncertainty. While not always the case, often one of the options available to an engineer is to gather information that will reduce the uncertainty. With the reduced uncertainty, the engineer then returns to the same decision with more information. This sequential information-gathering decision problem is difficult to analyze and solve because the engineer must predict the value of gathering information in order to determine whether that value outweighs the cost of the resources expended to gather it. In practice, heuristics, intuition, and deadlines are often used to decide whether or not to gather information. A more complete and formal approach for quantifying the value of gathering information would benefit engineers in design decision making.
Recent work proposed that a Partially Observable Markov Decision Process (POMDP) is an appropriate formalism for modeling sequential information-gathering decisions. A POMDP appears capable of capturing the salient features of such decisions. However, existing POMDP solution algorithms scale poorly with problem size. This thesis introduces an improved algorithm for solving POMDPs that takes advantage of certain characteristics inherent to information-gathering decision problems. The new algorithm is orders of magnitude faster and is capable of handling specific problem parameters that existing methods cannot. The improvement is shown with a detailed case study, which also compares the POMDP formalism for solving information-gathering decision problems to widely known approximate methods, such as Expected Value of Information methods. The study demonstrates that the POMDP formalism, along with the improved algorithm, provides a valuable method for solving certain information-gathering decision problems.
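The Expected Value of Information comparison mentioned above rests on a simple calculation: gathering information is only worthwhile if its expected value exceeds its cost. A toy expected-value-of-perfect-information check, with all payoffs and probabilities invented for illustration:

```python
# Should the engineer pay for a test that reveals which design state is
# true before committing? All numbers below are made up, not from the thesis.
p_good = 0.6                      # prior that material variant A is adequate
payoff = {("choose_A", "good"): 100.0, ("choose_A", "bad"): -50.0,
          ("choose_B", "good"): 40.0,  ("choose_B", "bad"): 40.0}

def best_ev(p):
    # Expected value of the best immediate commitment under belief p.
    ev_A = p * payoff[("choose_A", "good")] + (1 - p) * payoff[("choose_A", "bad")]
    ev_B = payoff[("choose_B", "good")]     # B is insensitive to the state
    return max(ev_A, ev_B)

ev_without_info = best_ev(p_good)                       # decide now: 40
ev_with_info = p_good * 100.0 + (1 - p_good) * 40.0     # decide after a perfect test
evpi = ev_with_info - ev_without_info
print(ev_without_info, ev_with_info, evpi)
```

Here the test is worth buying only if it costs less than the EVPI of 36; a POMDP formulation generalizes this one-shot check to sequences of imperfect tests.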
Advisors/Committee Members: Malak, Richard J (advisor), Hartl, Darren J (advisor), Allaire, Douglas (committee member).
Subjects/Keywords: Design under Uncertainty; Information-Gathering; Sequential Decisions; POMDP
APA (6th Edition):
Halbert, T. R. (2015). An Improved Algorithm for Sequential Information-Gathering Decisions in Design under Uncertainty. (Masters Thesis). Texas A&M University. Retrieved from http://hdl.handle.net/1969.1/155384

University of Illinois – Chicago
18.
Han, Yanlin.
Symbolic and Neural Approaches for Learning Other Agents’ Intentional Models using Interactive POMDPs.
Degree: 2018, University of Illinois – Chicago
URL: http://hdl.handle.net/10027/23264
Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in a partially observable, stochastic and multi-agent environment. I-POMDPs augment POMDP beliefs with nested hierarchical belief structures. In order to plan optimally using I-POMDPs, we propose symbolic and neural approaches that learn others' intentional models, which ascribe to them beliefs, preferences and rationality in action selection.
In the symbolic Bayesian approach, agents maintain beliefs over intentional models of other agents and make sequential Bayesian updates using observations. To deal with the complexity of the hierarchical belief space, we have devised a customized interactive particle filter (I-PF) to descend the belief hierarchy, parametrize others' models, and sample all model parameters at each nesting level. We have also devised a neural network approximation of the I-POMDP framework, in which the belief update, value function, and policy function are implemented by various neural networks (NNs). We then combined the same network architecture with the QMDP planner and trained it end-to-end in a reinforcement learning fashion.
Empirical results show that our Bayesian learning approach accurately learns models of the other agent. It serves as a generalized Bayesian learning algorithm that learns other agents' beliefs, nesting levels, and transition, observation and reward functions. Moreover, we show that the model-based network, which learns to plan, outperforms the model-free network, which only learns reactive policies. The learned policy generalizes directly to a larger, unseen setting.
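The QMDP planner mentioned above is a standard POMDP heuristic: solve the underlying fully observable MDP, then score each action at a belief b by Q(b, a) = Σ_s b(s) Q_MDP(s, a) and act greedily. A minimal sketch on an invented random model (not the thesis's neural implementation):

```python
import numpy as np

# Invented random model: S states, A actions, discount gamma.
S, A, gamma = 3, 2, 0.9
rng = np.random.default_rng(0)
T = rng.dirichlet(np.ones(S), size=(A, S))   # T[a, s] = P(. | s, a) over s'
R = rng.normal(size=(S, A))                  # R[s, a]

def qmdp_q(T, R, gamma, iters=500):
    """Value iteration on the underlying, fully observable MDP."""
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = Q.max(axis=1)                               # V(s') = max_a Q(s', a)
        Q = R + gamma * np.einsum("asz,z->sa", T, V)    # expected next-state value
    return Q

def qmdp_action(belief, Q):
    """QMDP: score each action by Q(b, a) = sum_s b(s) Q(s, a), act greedily."""
    return int(np.argmax(belief @ Q))

Q = qmdp_q(T, R, gamma)
print(qmdp_action(np.array([0.2, 0.5, 0.3]), Q))
```

Because QMDP assumes all state uncertainty disappears after one step, it never chooses actions whose only benefit is gathering information, which is a standard motivation for learning richer planners on top of it.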
Advisors/Committee Members: Gmytrasiewicz, Piotr (advisor), Liu, Bing (committee member), Ziebart, Brian (committee member), Zhang, Xinhua (committee member), Koyuncu, Erdem (committee member), Gmytrasiewicz, Piotr (chair).
Subjects/Keywords: I-POMDP; sampling; multi-agent systems; deep reinforcement learning
APA (6th Edition):
Han, Y. (2018). Symbolic and Neural Approaches for Learning Other Agents’ Intentional Models using Interactive POMDPs. (Thesis). University of Illinois – Chicago. Retrieved from http://hdl.handle.net/10027/23264
19.
COWDEROY, GRACE.
Partitioning POMDPs for multiple input types, and their application to dialogue managers.
Degree: School of Computer Science & Statistics. Discipline of Computer Science, 2018, Trinity College Dublin
URL: http://hdl.handle.net/2262/85443
This research looks at the POMDP models used for dialogue management within Spoken Dialogue Systems (SDS). In particular, it examines the difficulty of handling multimodal inputs, and it proposes a generalisation of the POMDP model in order to tackle this. This model is shown, via a Car Advisory system, to improve tractability for multimodal inputs.
Advisors/Committee Members: VOGEL, CARL.
Subjects/Keywords: partitioning POMDPs; car advisory system; POMDP; dialogue manager; multimodal input
APA (6th Edition):
COWDEROY, G. (2018). Partitioning POMDPs for multiple input types, and their application to dialogue managers. (Thesis). Trinity College Dublin. Retrieved from http://hdl.handle.net/2262/85443

Delft University of Technology
20.
Hugtenburg, Stefan (author).
Reducing the need for communication in wireless sensor networks using machine learning and planning techniques.
Degree: 2017, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:637213b2-c46a-4a07-b740-21d6c4742b93
Wireless sensor networks are commonly used to remotely and automatically monitor environments. One of the main challenges in wireless sensor networks is to use the limited available energy as efficiently as possible, to ensure longevity of the network. For such networks to survive their intended deployment period, no energy may be wasted on inconsequential actions. As communication is one of the most energy-consuming tasks a sensor mote can perform, we propose a set of techniques that allow a base station to form an accurate estimate of the environmental state from a selected subset of measurements. In this thesis we present a novel methodology that combines three forms of intelligence. The sensor mote and base station both maintain a neural network-based predictor of the environmental state, which the sensor mote uses as input for different controllers (both handmade and based on Partially Observable Markov Decision Processes) that determine the actions performed by the sensor mote. Armed with the prediction mechanism, a model of the environment, the controller executed by the sensor mote, and the reported measurements, the base station performs computations akin to those commonly used with Hidden Markov Models to form an accurate estimate of the environmental state. We apply our techniques to real world data sets and reduce the required number of report operations by over 90% whilst incurring only minimal accuracy penalties.
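The HMM-style estimation described above can be sketched as a forward filter that predicts through the transition model at every step but applies the Bayesian correction only at steps where the mote actually reported a measurement. The two-state environment and all probabilities below are invented for illustration:

```python
import numpy as np

# Toy discrete environment with states {low, high}; numbers are made up.
T = np.array([[0.9, 0.1],        # P(next state | current state)
              [0.2, 0.8]])
E = np.array([[0.8, 0.2],        # P(measurement | state)
              [0.3, 0.7]])

def filter_states(reports, prior=np.array([0.5, 0.5])):
    """reports: list of measurement indices, or None where no report arrived."""
    b = prior.copy()
    for z in reports:
        b = b @ T                # predict: propagate through the transition model
        if z is not None:        # correct: only when a measurement was reported
            b = b * E[:, z]
            b /= b.sum()
    return b

print(filter_states([0, None, None, 1]))
```

Skipped reports simply widen the belief via the prediction step, which is why the base station can tolerate a large fraction of suppressed transmissions with only a small accuracy penalty.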
Advisors/Committee Members: de Weerdt, Mathijs (mentor), Spaan, Matthijs (graduation committee), Pawelczak, Przemek (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: wireless sensor networks; Machine Learning; POMDP; neural networks; planning; energy conservation
APA (6th Edition):
Hugtenburg, S. (2017). Reducing the need for communication in wireless sensor networks using machine learning and planning techniques. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:637213b2-c46a-4a07-b740-21d6c4742b93

Delft University of Technology
21.
Chandiramani, Jayesh (author).
Decision Making under Uncertainty for Automated Vehicles in Urban Situations.
Degree: 2017, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:b8339e46-0fde-48bc-a050-ec204c208e20
One of the most crucial links in building an autonomous system is the task of decision making. The ability of a vehicle to make robust decisions on its own, by predicting and assessing future consequences, is what makes it intelligent. This task of decision making becomes even more complex due to the fact that the real world is uncertain and continuous, and vehicles interact with each other. Sensors that perceive the real world and measure quantities such as the position and speed of other traffic objects are inherently noisy and further susceptible to external conditions. On the other hand, road users' intentions are stochastic and not measurable, and the presence of partial or complete vision-based occlusions can render any measurements obtained useless. The decision making unit thus has to be aware of these issues and use the limited knowledge available to anticipate future situations, which could unfold in an infinite number of ways, in order to maximize a reward (or minimize a cost). In this thesis, a method to make automated longitudinal decisions along a predetermined path for autonomous vehicles in unsignalized urban scenarios is proposed. The decision making problem is formulated as a continuous Partially Observable Markov Decision Process (POMDP) with a discrete Bayesian Network estimating the behavior of other traffic objects. The future evolution of states is predicted using a multi-model Trajectory Estimator along with the digital map of the current driving area. Since continuous spaces make the belief space infinitely large, calculating the value function for this problem becomes computationally intractable. The presented solver algorithm approximates the value function instead of computing it directly, and additional optimizations reduce the belief space exploration area in order to improve performance.
The results show that this single generalized framework is able to handle all the tested urban scenarios with a good safety margin in scenes with multiple traffic objects, even under zero visibility of the other traffic objects due to the presence of occlusions.
Systems and Control
Advisors/Committee Members: Baldi, Simone (mentor), Sefati, Mohsen (mentor), Keviczky, Tamas (graduation committee), Ferranti, Laura (graduation committee), Happee, Riender (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Autonomous driving; Decision making; Uncertainty; POMDP; Bayesian networks; Occlusion
APA (6th Edition):
Chandiramani, J. (2017). Decision Making under Uncertainty for Automated Vehicles in Urban Situations. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:b8339e46-0fde-48bc-a050-ec204c208e20
22.
Van den Hof, W.D. (author).
Robot Search in Unknown Environments using POMDPs.
Degree: 2014, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:5902cc25-6d45-4ea2-9a8a-4098874e8444
Search is an important competence for a robot. It is the core task of a search and rescue robot, and many other typical robot tasks require some form of search. In order to be truly autonomous, the robot must be able to perform its tasks in an unknown environment. Using SLAM, the robot has to make search decisions with increasing knowledge of the environment. Today, mostly Next Best View (NBV) algorithms are used to achieve this. These are heuristic-based algorithms, which balance heuristic measures such as information gain and traveling time. The action is often chosen greedily. In this thesis a novel approach is taken: POMDPs are used to model the problem of object search. Representing the changing environment explicitly in a POMDP is not feasible, since the number of possible layouts of the environment is too great. Instead, an instance of the POMDP model and its solution are recalculated every time step. Six different POMDP models were designed for known environments and three models for unknown environments. These were tested in a simulated environment and compared to a baseline Next Best View algorithm. Surprisingly, reasoning about unexplored environments was shown not to be necessary for a good result. Although NBV works fast, the POMDP clearly gives a better solution, with an average search path that is almost half as long.
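The greedy Next Best View baseline described above can be sketched as scoring candidate viewpoints by information gain traded off against travel time; the candidate names, numbers, and the gain − α·cost form below are all illustrative assumptions, not the thesis's implementation:

```python
# Invented candidate viewpoints; gains and travel times are made-up numbers.
candidates = [
    {"name": "doorway", "info_gain": 12.0, "travel_time": 4.0},
    {"name": "corner",  "info_gain": 7.0,  "travel_time": 1.0},
    {"name": "hall",    "info_gain": 15.0, "travel_time": 9.0},
]

def nbv_score(c, alpha=1.0):
    # One common trade-off: utility = gain - alpha * cost (gain / cost is
    # another frequently used form).
    return c["info_gain"] - alpha * c["travel_time"]

# Greedy choice: take the single best-scoring view, with no lookahead.
best = max(candidates, key=nbv_score)
print(best["name"])
```

The POMDP formulation differs precisely here: instead of this one-step heuristic score, it evaluates sequences of views against a belief over object locations.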
BMD
BioMechanical Engineering
Mechanical, Maritime and Materials Engineering
Advisors/Committee Members: Jonker, P.P. (mentor), Caarls, W. (mentor).
Subjects/Keywords: POMDP Search Robot
APA (6th Edition):
Van den Hof, W. D. (2014). Robot Search in Unknown Environments using POMDPs. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:5902cc25-6d45-4ea2-9a8a-4098874e8444

University of Windsor
23.
Small, Brian.
An Approach for Intention-Driven, Dialogue-Based Web Search.
Degree: MS, Computer Science, 2012, University of Windsor
URL: https://scholar.uwindsor.ca/etd/5411
Web search engines facilitate the achievement of Web-mediated tasks, including information retrieval, Web page navigation, and online transactions. These tasks often involve goals that pertain to multiple topics, or domains. Current search engines are not suitable for satisfying complex, multi-domain needs due to their lack of interactivity and knowledge. This thesis presents a novel intention-driven, dialogue-based Web search approach that uncovers and combines users' multi-domain goals to provide helpful virtual assistance. The intention discovery procedure uses a hierarchy of Partially Observable Markov Decision Process-based dialogue managers and a backing knowledge base to systematically explore the dialogue's information space, probabilistically refining the perception of user goals. The search approach has been implemented in IDS, a search engine for online gift shopping. A usability study comparing IDS-based searching with Google-based searching found that the IDS-based approach takes significantly less time and effort, and results in higher user confidence in the retrieved results.
Advisors/Committee Members: Xiaobu Yuan.
Subjects/Keywords: dialogue; information retrieval; multi-domain; POMDP; virtual assistance; Web search
APA (6th Edition):
Small, B. (2012). An Approach for Intention-Driven, Dialogue-Based Web Search. (Masters Thesis). University of Windsor. Retrieved from https://scholar.uwindsor.ca/etd/5411

Clemson University
24.
Yu, Lu.
Stochastic Tools for Network Security: Anonymity Protocol Analysis and Network Intrusion Detection.
Degree: PhD, Electrical Engineering, 2012, Clemson University
URL: https://tigerprints.clemson.edu/all_dissertations/1061
With the rapid development of the Internet and the sharp increase in network crime, network security has become very important and has received a lot of attention. In this dissertation, we model security issues as stochastic systems. This allows us to find weaknesses in existing security systems and propose new solutions. Exploring the vulnerabilities of existing security tools can prevent cyber-attacks from taking advantage of system weaknesses. We consider The Onion Router (Tor), which is one of the most popular anonymity systems in use today, and show how to detect a protocol tunnelled through Tor. A hidden Markov model (HMM) is used to represent the protocol. Hidden Markov models are statistical models of sequential data, like network traffic, and are an effective tool for pattern analysis. New, flexible and adaptive security schemes are needed to cope with emerging security threats. We propose a hybrid network security scheme including intrusion detection systems (IDSs) and honeypots scattered throughout the network, combining the advantages of the two security technologies. A honeypot is an activity-based network security system, which can be the logical supplement to the passive detection policies used by IDSs. This integration forces us to balance security performance against cost by scheduling device activities for the proposed system. By formulating the scheduling problem as a decentralized partially observable Markov decision process (DEC-POMDP), decisions are made in a distributed manner at each device without requiring centralized control. When using an HMM, it is important to ensure that it accurately represents both the data used to train the model and the underlying process. Current methods assume that the observations used to construct an HMM completely represent the underlying process. It is often the case that the training data size is not large enough to adequately capture all statistical dependencies in the system. It is therefore important to know the statistical significance level at which the constructed model represents the underlying process, not only the training set. We present a method to determine whether the observation data and constructed model fully express the underlying process with a given level of statistical significance. We apply this approach to detecting the existence of protocols tunnelled through Tor. While HMMs are a powerful tool for representing patterns while allowing for uncertainties, they cannot be used for system control. The partially observable Markov decision process (POMDP) is a useful choice for controlling stochastic systems. As a combination of two Markov models, POMDPs combine the strength of the HMM (capturing dynamics that depend on unobserved states) and that of the Markov decision process (MDP) (taking the decision aspect into account). Decision making under uncertainty is used in many areas of business and science. We use it here for security tools. We propose three approximation methods for discrete-time infinite-horizon…
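The HMM/MDP combination described above shows up concretely in the POMDP belief update, which chains an MDP-style prediction through action-dependent transitions with an HMM-style Bayesian correction: b'(s') ∝ O(z | s', a) Σ_s T(s' | s, a) b(s). A minimal sketch on an invented two-state benign/attack model (numbers and names are illustrative, not from the dissertation):

```python
import numpy as np

# Invented toy security model, states {benign, attack}.
T = {"idle":  np.array([[0.95, 0.05], [0.10, 0.90]]),   # P(s' | s, a)
     "probe": np.array([[0.95, 0.05], [0.30, 0.70]])}   # probing disrupts attacks a little
O = {"idle":  np.array([[0.9, 0.1], [0.4, 0.6]]),       # P(z | s', a), z in {quiet, alert}
     "probe": np.array([[0.8, 0.2], [0.1, 0.9]])}       # probing observes more sharply

def belief_update(b, action, z):
    predicted = b @ T[action]                 # MDP-style prediction through T(. | s, a)
    corrected = predicted * O[action][:, z]   # HMM-style Bayesian correction
    return corrected / corrected.sum()

b = np.array([0.8, 0.2])                      # prior: probably benign
b = belief_update(b, "probe", 1)              # saw an alert after probing
print(b)
```

A POMDP controller then chooses actions against this belief, which is exactly the decision-making layer that a plain HMM lacks.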
Advisors/Committee Members: Yu, Lu, Brooks , Richard R, Hoover , Adam, Walker , Ian, Chowdhury , Mashrur.
Subjects/Keywords: network security; optimization; POMDP; stochastic process; Tor; Electrical and Computer Engineering
APA (6th Edition):
Yu, L. (2012). Stochastic Tools for Network Security: Anonymity Protocol Analysis and Network Intrusion Detection. (Doctoral Dissertation). Clemson University. Retrieved from https://tigerprints.clemson.edu/all_dissertations/1061

University of Waterloo
25.
Salmon, Ricardo.
On the relationship between satisfiability and partially observable Markov decision processes.
Degree: 2018, University of Waterloo
URL: http://hdl.handle.net/10012/13951
► Stochastic satisfiability (SSAT), Quantified Boolean Satisfiability (QBF) and decision-theoretic planning in finite horizon partially observable Markov decision processes (POMDPs) are all PSPACE-Complete problems. Since they…
(more)
▼ Stochastic satisfiability (SSAT), Quantified Boolean Satisfiability (QBF) and decision-theoretic planning in finite horizon partially observable Markov decision processes (POMDPs) are all PSPACE-Complete problems. Since they are all complete for the same complexity class, I show how to convert them into one another in polynomial time and space. I discuss various properties of each encoding and how they get translated into equivalent constructs in the other encodings. An important lesson of these reductions is that the states in SSAT and flat POMDPs do not match. Therefore, comparing the scalability of satisfiability and flat POMDP solvers based on the size of the state spaces they can tackle is misleading.
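To make the SSAT side of these reductions concrete, here is a minimal recursive evaluator for the semantics of stochastic satisfiability: maximize over existential variables, take expectations over randomized ones. The encoding (tuples for the prefix, signed integers for literals) is our own assumption and says nothing about SSAT-Prime's internals:

```python
# Minimal SSAT evaluator (illustrative only; not how SSAT-Prime works).
# prefix: list of ('E', var) or ('R', var, p); clauses: CNF with signed int literals.
def ssat_value(prefix, clauses, assignment=None):
    assignment = assignment or {}
    if not prefix:
        sat = all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)
        return 1.0 if sat else 0.0
    head, rest = prefix[0], prefix[1:]
    if head[0] == 'E':                       # existential: pick the better branch
        return max(ssat_value(rest, clauses, {**assignment, head[1]: b})
                   for b in (True, False))
    _, v, p = head                           # randomized: expectation over a biased coin
    return (p * ssat_value(rest, clauses, {**assignment, v: True})
            + (1 - p) * ssat_value(rest, clauses, {**assignment, v: False}))

# E x1, R x2 with P(x2)=0.5 on (x1 or x2) and (not x1 or not x2): value 0.5
val = ssat_value([('E', 1), ('R', 2, 0.5)], [[1, 2], [-1, -2]])
```

The clauses encode x1 XOR x2, so whatever the existential player picks for x1, the random x2 satisfies the formula with probability 0.5.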
A new SSAT solver called SSAT-Prime is proposed and implemented. It includes improvements to watched literals, component caching, and detecting symmetries with upper and lower bounds under certain conditions. SSAT-Prime is compared against a state-of-the-art solver for probabilistic inference and a native POMDP solver on challenging benchmarks.
Subjects/Keywords: POMDP; Stochastic SAT; Satisfiability; Planning; Probabilistic Inference; SAT; QBF
APA (6th Edition):
Salmon, R. (2018). On the relationship between satisfiability and partially observable Markov decision processes. (Thesis). University of Waterloo. Retrieved from http://hdl.handle.net/10012/13951

University of Windsor
26.
Szucs, Tristan.
Lip Synchronization for ECA Rendering with Self-Adjusted POMDP Policies.
Degree: MS, Computer Science, 2020, University of Windsor
URL: https://scholar.uwindsor.ca/etd/8430
► The recent advancements in virtual reality have allowed for the creation of autonomous agents to aid humans in the retrieval and processing of useful digital…
(more)
▼ The recent advancements in virtual reality have allowed for the creation of autonomous agents that aid humans in retrieving and processing useful digital information, or that complete tasks the user requests. Known as embodied conversational agents (ECAs), these intelligent agents bridge the physical and virtual worlds by providing natural verbal and non-verbal forms of communication with the user. To provide a positive user experience, it is essential for an ECA not only to appear human-like but also to correctly identify the user's intention so that it can assist the user appropriately. This thesis continues our research group's investigation into improving POMDP-based dialogue management using machine learning on the POMDP's belief-state history. It integrates a technique to match lip movements with the rendered ECA audio alongside the automatically selected emotion. Finally, this research conducts experiments that use machine learning techniques to adjust POMDP policies, and compares their effectiveness in terms of dialogue length and successful intention-discovery rate.
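The keywords mention Q-learning as the machine-learning technique used to adjust POMDP policies. As a generic, hypothetical sketch of the tabular Q-learning update on discretized dialogue states (the thesis's actual state features, actions, and rewards are not specified here, so all names below are our own):

```python
import random
from collections import defaultdict

# Generic tabular Q-learning (illustrative; the real system's states/actions/rewards
# come from the POMDP belief-state history and are not reproduced here).
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
Q = defaultdict(float)                      # Q[(state, action)], default 0.0
ACTIONS = ["ask", "confirm", "inform"]      # hypothetical dialogue actions

def choose(state):
    """Epsilon-greedy action selection over the current Q estimates."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One temporal-difference backup toward reward + discounted best next value."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

A single call like `update("s0", "ask", 1.0, "s1")` nudges the estimate for `("s0", "ask")` toward the observed return.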
Advisors/Committee Members: Xiaobu Yuan.
Subjects/Keywords: Dialogue Management; ECA; Expressive; POMDP; Q-Learning; Virtual Reality
APA (6th Edition):
Szucs, T. (2020). Lip Synchronization for ECA Rendering with Self-Adjusted POMDP Policies. (Masters Thesis). University of Windsor. Retrieved from https://scholar.uwindsor.ca/etd/8430

University of Central Florida
27.
Coaguila, Quiquia Rey.
Autonomous Quadcopter Videographer.
Degree: 2015, University of Central Florida
URL: https://stars.library.ucf.edu/etd/64
► In recent years, the interest in quadcopters as a robotics platform for autonomous photography has increased. This is due to their small size and mobility,…
(more)
▼ In recent years, interest in quadcopters as a robotics platform for autonomous photography has increased, due to their small size and mobility, which allow them to reach places that are difficult or even impossible for humans. This thesis focuses on the design of an autonomous quadcopter videographer, i.e. a quadcopter capable of capturing good footage of a specific subject. In order to obtain this footage, the system needs to choose appropriate vantage points and control the quadcopter. Skilled human videographers can easily spot good filming locations where the subject and its actions can be seen clearly in the resulting video footage, but translating this knowledge to a robot can be complex. We present an autonomous system, implemented on a commercially available quadcopter, that achieves this using only monocular vision and an accelerometer. Our system has two vantage-point selection strategies: 1) a reactive approach, which moves the robot to a fixed location with respect to the human, and 2) a combination of the reactive approach and a POMDP planner that considers the target's movement intentions. We compare the behavior of these two approaches under different target movement scenarios. The results show that the POMDP planner obtains more stable footage with less quadcopter motion.
Advisors/Committee Members: Sukthankar, Gita.
Subjects/Keywords: Quadcopter; videographer; photography; robotics; artificial intelligence; applications; pomdp; Computer Sciences; Engineering
APA (6th Edition):
Coaguila, Q. R. (2015). Autonomous Quadcopter Videographer. (Masters Thesis). University of Central Florida. Retrieved from https://stars.library.ucf.edu/etd/64
28.
Zheltova, Ludmila.
STRUCTURED MAINTENANCE POLICIES ON INTERIOR SAMPLE PATHS.
Degree: PhD, Operations, 2010, Case Western Reserve University School of Graduate Studies
URL: http://rave.ohiolink.edu/etdc/view?acc_num=case1264627939
► This dissertation has three chapters related to the maintenance optimization.In the first chapter, we examine the problem of adaptively scheduling perfectinspections and preventive replacement…
(more)
▼ This dissertation has three chapters related to maintenance optimization. In the first chapter, we examine the problem of adaptively scheduling perfect inspections and preventive replacement for a multi-state, Markovian deterioration system with silent failures, such that the total expected discounted cost is minimized. We model this problem as a partially observed Markov decision process for which three actions (replace, do nothing, perfectly inspect) are available at each decision epoch, and establish structural properties of the optimal policy for certain nonextreme sample paths. The second chapter also considers a discrete-time, Markovian deterioration system. However, while the first chapter considers a maintenance action with a perfect outcome, that is, as a result of replacement the system transits to an "as good as new" state, the second chapter explores the optimal policy structure for such a system in the case of stochastic repair. In the third chapter, we consider a discrete-time, nonstationary Markovian deterioration system with the same set of actions as in the first chapter. We investigate the structure of the optimal maintenance policy for such a system by minimizing the expected total discounted cost over an infinite horizon.
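A fully observed, two-action simplification of such a maintenance model (replace vs. do nothing, dropping the inspection action that makes the problem a POMDP) can illustrate the discounted-cost criterion via value iteration; all numbers below are hypothetical:

```python
# Toy fully-observed maintenance MDP, minimizing expected discounted cost.
# States 0..2 deteriorate monotonically; state 2 is failed. Numbers are hypothetical.
BETA = 0.9
STATES = [0, 1, 2]
P = [[0.8, 0.15, 0.05],   # deterioration transitions under "do nothing"
     [0.0, 0.7,  0.3],
     [0.0, 0.0,  1.0]]
OPERATING_COST = [0.0, 2.0, 10.0]
REPLACE_COST = 5.0        # replacing resets the system to state 0 ("as good as new")

V = [0.0] * 3
for _ in range(500):      # value iteration: contract toward the fixed point
    V = [min(REPLACE_COST + OPERATING_COST[0] + BETA * sum(P[0][j] * V[j] for j in range(3)),
             OPERATING_COST[s] + BETA * sum(P[s][j] * V[j] for j in range(3)))
         for s in STATES]
```

In the partially observed version studied in the dissertation, the same recursion runs over beliefs rather than states, with inspection as a third, information-gathering action.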
Advisors/Committee Members: Sobel, Matthew (Committee Chair).
Subjects/Keywords: Operations Research; Markov; POMDP
APA (6th Edition):
Zheltova, L. (2010). STRUCTURED MAINTENANCE POLICIES ON INTERIOR SAMPLE PATHS. (Doctoral Dissertation). Case Western Reserve University School of Graduate Studies. Retrieved from http://rave.ohiolink.edu/etdc/view?acc_num=case1264627939
29.
Pinault, Florian.
Apprentissage par renforcement pour la généralisation des approches automatiques dans la conception des systèmes de dialogue oral : Statistical methods for a oral human-machine dialog system.
Degree: Docteur es, Informatique, 2011, Avignon
URL: http://www.theses.fr/2011AVIG0188
► The human-machine dialogue systems currently used in industry are strongly limited by a very rigid form of communication that forces the user to follow…
(more)
▼ The human-machine dialogue systems currently used in industry are strongly limited by a very rigid form of communication that forces the user to follow the logic of the system's designer. This limitation is partly due to their representation of the dialogue state as predefined forms. To address this difficulty, we propose using a semantic representation with a richer, more flexible structure, aimed at letting the user formulate requests freely. A second difficulty that greatly handicaps dialogue systems is the high error rate of the speech recognizer. To handle these errors quantitatively, the goal of planning a dialogue strategy under uncertainty has led to the use of reinforcement learning methods such as partially observable Markov decision processes (POMDPs). A drawback of the POMDP paradigm, however, is its high algorithmic complexity. Some recent proposals reduce the complexity of the model, but they rely on a form-based representation and cannot be applied directly to the rich semantic representation we propose to use.
To apply the POMDP model in a system whose semantic model is complex, we propose a new way to control its complexity by introducing a new paradigm: the summary POMDP with double belief tracking. In our proposal, the complex master POMDP is transformed into a simpler summary POMDP. A first belief update is performed in the master space (integrating probabilistic observations in the form of n-best lists), and a second belief update is performed in the summary space, so the resulting strategies are optimized on a true POMDP. We propose two methods for defining the projection of the master POMDP onto a summary POMDP: manual rules, and automatic clustering by k nearest neighbors. For the latter, we propose using the graph edit distance, which we generalize to obtain a distance between n-best lists. In addition, coupling a summary system, based on a statistical POMDP model, with an expert system, based on ad hoc rules, provides better control over the final strategy; this lack of control is indeed one of the weaknesses preventing the adoption of POMDPs for dialogue in industry.
In the domain of tourist information and hotel room booking, results on simulated dialogues show the effectiveness of the reinforcement approach, combined with a rule system, in adapting to a noisy environment. Real tests on human users show that a system optimized by reinforcement nevertheless obtains better performance on the criterion for which it was optimized.
Advisors/Committee Members: Lefèvre, Fabrice (thesis director).
Subjects/Keywords: POMDP; Dialogue; Interface homme-machine; Apprentissage par renforcement; Méthodes statistiques; Frames sémantiques; POMDP; Dialog system; Dialogue; Reinforcement learning; IHM; Statistics; Semantics frames; Structured semantics; 006.454
APA (6th Edition):
Pinault, F. (2011). Apprentissage par renforcement pour la généralisation des approches automatiques dans la conception des systèmes de dialogue oral : Statistical methods for a oral human-machine dialog system. (Doctoral Dissertation). Avignon. Retrieved from http://www.theses.fr/2011AVIG0188
30.
Habachi, Oussama.
Optimisation des Systèmes Partiellement Observables dans les Réseaux Sans-fil : Théorie des jeux, Auto-adaptation et Apprentissage : Optimization of Partially Observable Systems in Wireless Networks : Game Theory, Self-adaptivity and Learning.
Degree: Docteur es, Informatique, 2012, Avignon
URL: http://www.theses.fr/2012AVIG0179
► The last decade has seen the emergence of the Internet and of multimedia applications that require ever more bandwidth, as well as users…
(more)
▼ The last decade has seen the emergence of the Internet and of multimedia applications that require ever more bandwidth, along with users who demand better quality of service. In this context, much work has been done to improve the use of the wireless spectrum. My doctoral thesis concerns the application of game theory, queueing theory and learning in wireless networks, particularly in partially observable environments. We consider different layers of the OSI model: we study opportunistic access to the wireless spectrum at the MAC layer using cognitive radio (CR) technology, and we then focus on congestion control at the transport layer, developing congestion control mechanisms for the TCP protocol.
Since delay-sensitive and bandwidth-intensive multimedia applications have emerged on the Internet, the demand for network resources has seen a steady increase during the last decade. Specifically, wireless networks have become pervasive and highly populated. These motivations are behind the problems considered in this dissertation. The topic of my PhD is the application of game theory, queueing theory and learning techniques in wireless networks under QoS constraints, especially in partially observable environments. We consider different layers of the protocol stack. In fact, we study Opportunistic Spectrum Access (OSA) at the Medium Access Control (MAC) layer through Cognitive Radio (CR) approaches. Thereafter, we focus on congestion control at the transport layer, and we develop congestion control mechanisms under the TCP protocol. The roadmap of the research is as follows. Firstly, we focus on the MAC layer and seek optimal OSA strategies in CR networks. We consider that Secondary Users (SUs) take advantage of opportunities in licensed channels while ensuring a minimum level of QoS. In fact, SUs have the possibility to sense and access licensed channels, or to transmit their packets using a dedicated access (like 3G). Therefore, an SU has two conflicting goals: seeking opportunities in licensed channels, at the cost of the energy spent sensing those channels, or transmitting over the dedicated channel without sensing, but with higher transmission delay. We model the slotted and the non-slotted systems using a queueing framework. Thereafter, we analyze the non-cooperative behavior of SUs, and we prove the existence of a Nash equilibrium (NE) strategy.
Moreover, we measure the gap in performance between the centralized and the decentralized systems using the Price of Anarchy (PoA). Even though OSA at the MAC layer was deeply investigated in the last decade, the performance of SUs, such as energy consumption or Quality of Service (QoS) guarantees, was somewhat ignored. Therefore, we study OSA taking into account energy consumption and delay. We consider, first, one SU that accesses opportunistically…
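The sense-vs-dedicated trade-off described above can be illustrated by a toy expected-cost comparison; the cost model and all numbers are our own assumptions, not the thesis's queueing model:

```python
# Toy decision rule for a secondary user (hypothetical cost model and numbers):
# sense a licensed channel (energy cost, channel may be busy) vs. dedicated access.
def sense_cost(p_free, e_sense, d_tx, d_retry):
    """Expected energy-plus-delay cost of sensing: transmit if free, retry otherwise."""
    return e_sense + p_free * d_tx + (1 - p_free) * d_retry

def dedicated_cost(d_dedicated):
    """Dedicated channel: no sensing energy, but higher transmission delay."""
    return d_dedicated

p_free = 0.6  # assumed probability the licensed channel is idle
best = min(("sense", sense_cost(p_free, e_sense=0.5, d_tx=1.0, d_retry=4.0)),
           ("dedicated", dedicated_cost(3.0)),
           key=lambda t: t[1])
```

With these numbers, sensing wins (expected cost 2.7 vs. 3.0); lowering `p_free` would flip the choice, which is the tension the thesis studies.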
Advisors/Committee Members: Altman, Eitan (thesis director).
Subjects/Keywords: Théorie des jeux; Apprentissage; Auto organisation; Processus de décision Markoviens; POMDP; POSG; Evaluation de performances; Game theory; Learning; Self-adaptivity; Markov decision process; POMDP; POSG; 519.3; 621.382; 004.6
APA (6th Edition):
Habachi, O. (2012). Optimisation des Systèmes Partiellement Observables dans les Réseaux Sans-fil : Théorie des jeux, Auto-adaptation et Apprentissage : Optimization of Partially Observable Systems in Wireless Networks : Game Theory, Self-adaptivity and Learning. (Doctoral Dissertation). Avignon. Retrieved from http://www.theses.fr/2012AVIG0179