You searched for subject:(Markov decision process)
Showing records 1 – 30 of 282 total matches.

University of Manitoba
1.
Gunasekara, Charith.
Optimal threshold policy for opportunistic network coding under phase type arrivals.
Degree: Electrical and Computer Engineering, 2016, University of Manitoba
URL: http://hdl.handle.net/1993/31615
Network coding allows each node in a network to perform some coding operations on the data packets and improve the overall throughput of communication. However, network coding cannot be done unless there are enough packets to be coded, so at times it may be advantageous to wait for packets to arrive.
We consider a scenario in which two wireless nodes, each with its own buffer, communicate via a single access point using network coding. The access point first pairs each data packet being sent from each node and then performs the network coding operation. Packets arriving at the access point that cannot be paired are instead loaded into one of the two buffers at the access point. In the case where one of the buffers is empty and the other is not, network coding is not possible. When this happens the access point must either wait for a network coding opportunity, or transmit the unpaired packet without coding. Delaying packet transmission is associated with an increased waiting cost but also allows for an increase in the overall efficiency of wireless spectrum usage, and thus a decrease in packet transmission cost. Conversely, sending packets un-coded is associated with a decrease in waiting cost but also a decrease in the overall efficiency of the wireless spectrum usage. Hence, there is a trade-off between decreasing packet delay time and increasing the efficiency of the wireless spectrum usage.
We show that the optimal waiting policy for this system with respect to total cost, under phase-type packet arrivals, is to have a separate threshold for the buffer size that depends on the current phase of each arrival. We then show that the solution to this optimization problem can be obtained by treating it as a double-ended push-out queueing theory problem. We develop a new technique to keep track of the packet waiting time and the number of packets waiting in the double-ended push-out queue. We use the resulting queueing model to determine the optimal threshold policy and then analyze the performance of the system using a numerical approach.
Advisors/Committee Members: Alfa, Attahiru (Electrical and Computer Engineering), Yahampath, Pradeepa (Electrical and Computer Engineering) (supervisor), Cai, Jun (Electrical and Computer Engineering), Thulasiraman, Parimala (Computer Science), Down, Douglas (McMaster University) (examining committee).
Subjects/Keywords: Queueing Theory; Network Coding; Markov Decision Process
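A minimal sketch of the phase-dependent threshold rule described in the abstract, assuming a hypothetical two-phase arrival process; the threshold values are invented for illustration and are not taken from the thesis.

```python
def action(buffer_len: int, phase: int, thresholds: dict) -> str:
    """Send the unpaired packet uncoded once the buffer exceeds the
    threshold tied to the current arrival phase; otherwise keep waiting
    for a network coding opportunity."""
    return "send_uncoded" if buffer_len > thresholds[phase] else "wait"

# Hypothetical thresholds: wait longer in phase 0, where a pairing
# packet is presumed more imminent.
thresholds = {0: 3, 1: 1}
print(action(buffer_len=2, phase=0, thresholds=thresholds))  # wait
print(action(buffer_len=2, phase=1, thresholds=thresholds))  # send_uncoded
```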

University of Minnesota
2.
Liu, Yuhang.
Optimal serving schedules for multiple queues with size-independent service times.
Degree: MS, Industrial and Systems Engineering, 2013, University of Minnesota
URL: http://purl.umn.edu/160186
University of Minnesota M.S. thesis. May 2013. Major: Industrial and Systems Engineering. Advisor: Zizhuo Wang. 1 computer file (PDF); v, 37 pages.
We consider a service system with two Poisson arrival queues. There is a single server that chooses which queue to serve at each moment. Once a queue is served, all the customers are served within a fixed time. This model is useful in studying airport shuttling or certain online computing systems. In this thesis, we first establish a Markov Decision Process (MDP) model for this problem and study its structure. We then propose a simple yet optimal state-independent policy for this problem which is not only easy to implement, but also performs very well. If the service time of both queues equals one unit of time, we prove that the optimal state-independent policy has the following structure: serve the queue with the smaller arrival rate once, followed by serving the other queue k times, and we obtain an explicit formula for k. We conduct numerical tests for our policy and it performs very well. We also extend our discussion to a more general case in which the service time of the queues can be any integer, and we obtain the optimal state-independent policies in that case.
Advisors/Committee Members: Zizhuo Wang.
Subjects/Keywords: Markov decision process; Queuing theory; Traffic schedule
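A toy simulation of the policy structure described in the abstract: batch service clears the chosen queue in one unit of time, and the schedule serves the smaller-rate queue once, then the other queue k times. The arrival rates and k are invented; the thesis's explicit formula for k is not reproduced here.

```python
import math, random

def poisson(lam: float) -> int:
    """Knuth's method for sampling a Poisson count (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def cyclic_schedule(k: int):
    """Serve queue 0 (smaller arrival rate) once, then queue 1 k times."""
    while True:
        yield 0
        for _ in range(k):
            yield 1

random.seed(0)
lam = (0.2, 0.7)              # illustrative Poisson arrival rates
q = [0, 0]
sched = cyclic_schedule(k=3)  # k would come from the thesis's formula
steps, total_waiting = 100_000, 0
for _ in range(steps):        # one iteration = one unit service time
    q[next(sched)] = 0        # batch service empties the scheduled queue
    for j in (0, 1):
        q[j] += poisson(lam[j])
    total_waiting += sum(q)
print("average number waiting:", total_waiting / steps)
```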

Iowa State University
3.
Bertram, Joshua R.
A new solution for Markov Decision Processes and its aerospace applications.
Degree: 2020, Iowa State University
URL: https://lib.dr.iastate.edu/etd/17832
Markov Decision Processes (MDPs) are a powerful technique for modelling sequential decision-making problems and have been used over many decades to solve problems in domains including robotics, finance, and aerospace. However, MDPs are also known to be difficult to solve due to explosion in the size of the state space, which makes finding their solution intractable for many practical problems. Traditional approaches such as value iteration require that each state in the state space is represented as an element in an array, which eventually will exhaust the available memory of any computer. It is not unusual to find practical problems in which the number of states is so large that it will never conceivably be tractable on any computer (e.g., the number of states is of the order of the number of atoms in the universe). Historically, this issue has been mitigated by various means, but primarily by approximation (under the umbrella of Approximate Dynamic Programming), where the solution of the MDP (the value function) is modelled via an approximation function. Many linear function approximation methods have been proposed since Markov Decision Processes were introduced nearly 70 years ago. More recently, non-linear (e.g., deep neural net) function approximation methods have also been proposed to obtain a higher-quality estimate of the value function. While these methods help, they come with disadvantages including loss of accuracy caused by the approximation, and a training or fitting phase which may take a long time to converge.
This thesis makes two main contributions in the area of Markov Decision Processes: (1) a novel alternative theoretical understanding of the nature of Markov Decision Processes and their solutions, and (2) a new series of algorithms that can solve a subset of MDPs extremely quickly compared to the historical methods described above. We provide both an intuitive and a mathematical description of the method. We describe a progression of algorithms that demonstrate the utility of the approach in aerospace applications including guidance to goals, collision avoidance, and pursuit evasion. We start in 2D environments with simple aircraft models and end with 3D team-based pursuit evasion where the aircraft perform rolls and loops in a highly dynamic environment. We close by providing discussion and describing future research.
Subjects/Keywords: FastMDP; Markov Decision Process; MDP; MDPs
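For contrast with the fast solver the abstract advertises, the baseline it criticizes is easy to state: tabular value iteration, where the value function is literally an array with one entry per state. A sketch on a made-up two-state, two-action MDP:

```python
import numpy as np

P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[s, a, s'] transition probs
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[0.0, 1.0],                 # R[s, a] rewards
              [2.0, 0.0]])
gamma = 0.95

V = np.zeros(2)                           # one array slot per state
for _ in range(10_000):
    Q = R + gamma * (P @ V)               # Q[s, a] = R + γ Σ_s' P V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
print("V* ≈", V.round(3), "greedy policy:", Q.argmax(axis=1))
```

The per-state array is exactly the memory blow-up the abstract describes: the loop is fine for two states and hopeless when the state count rivals the number of atoms in the universe.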

University of Minnesota
4.
Liu, Yuhang.
Optimal serving schedules for multiple queues with size-independent service times.
Degree: MS, Industrial and Systems Engineering, 2013, University of Minnesota
URL: http://purl.umn.edu/160186
We consider a service system with two Poisson arrival queues. There is a single server that chooses which queue to serve at each moment. Once a queue is served, all the customers are served within a fixed time. This model is useful in studying airport shuttling or certain online computing systems. In this thesis, we first establish a Markov Decision Process (MDP) model for this problem and study its structure. We then propose a simple yet optimal state-independent policy for this problem which is not only easy to implement, but also performs very well. If the service time of both queues equals one unit of time, we prove that the optimal state-independent policy has the following structure: serve the queue with the smaller arrival rate once, followed by serving the other queue k times, and we obtain an explicit formula for k. We conduct numerical tests for our policy and it performs very well. We also extend our discussion to a more general case in which the service time of the queues can be any integer, and we obtain the optimal state-independent policies in that case.
Subjects/Keywords: Markov decision process; Queuing theory; Traffic schedule
5.
Shirmohammadi, Mahsa.
Qualitative analysis of synchronizing probabilistic systems : Analyse qualitative des systèmes probabilistes synchronisants.
Degree: Docteur es, Informatique, 2014, Cachan, Ecole normale supérieure; Université libre de Bruxelles (1970-....)
URL: http://www.theses.fr/2014DENS0054
Markov Decision Processes (MDPs) are finite probabilistic systems with both random choices and strategies, and are thus recognized as powerful tools for modelling the interactions between a controller and the random responses of the environment. Mathematically, an MDP can be viewed as a one-and-a-half-player stochastic game where, at each turn, the controller chooses an action and the environment responds by choosing a successor according to a fixed probability distribution.
There are two incomparable representations of the behaviour of an MDP once the strategy's choices are fixed. In the classical representation, an MDP is a generator of sequences of states, called state-outcomes; the player's winning conditions are then expressed as sets of desirable sequences of states visited during the game, e.g. Borel conditions such as reachability. The complexity of the decision problems, as well as the memory required by winning strategies, for such state-outcome conditions has already been studied extensively. More recently, MDPs have also been considered as generators of sequences of probability distributions over states, called distribution-outcomes. We introduce synchronizing conditions on distribution-outcomes, which intuitively require that the probability mass accumulates in one state (or set of states), potentially asymptotically.
A probability distribution is p-synchronizing if the probability mass is at least p in some state, and a sequence of probability distributions is always, eventually, weakly, or strongly p-synchronizing if, respectively, all, some, infinitely many, or all but finitely many distributions in the sequence are p-synchronizing. For each type of synchronization, an MDP can be (i) surely winning if there exists a strategy that generates a 1-synchronizing sequence; (ii) almost-surely winning if there exists a strategy that generates a (1-epsilon)-synchronizing sequence for every strictly positive epsilon; (iii) asymptotically winning if, for every strictly positive epsilon, there exists a strategy producing a (1-epsilon)-synchronizing sequence.
We consider the problem of deciding whether an MDP is winning, for each type of synchronization and each winning mode: we establish upper and lower complexity bounds for these problems, as well as the memory required by an optimal winning strategy. In addition, we study synchronization problems for probabilistic automata (PAs), which are instances of MDPs where the controllers are restricted to word-strategies; that is, they cannot observe the history of the system's execution and only know the number of choices made so far. The synchronizing languages of a PA are thus the sets of synchronizing word-strategies: we establish the complexity of the…
Advisors/Committee Members: Doyen, Laurent (thesis director), Massart, Thierry (thesis director).
Subjects/Keywords: Markov decision process; Probabilistic automata; Synchronising words
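The distribution-outcome view in the abstract is mechanically simple once a word (blind) strategy is fixed: each action multiplies the current distribution by a stochastic matrix. The sketch below checks an "eventually p-synchronizing" condition on a finite prefix; the matrices and the word are invented for illustration.

```python
import numpy as np

M = {"a": np.array([[1.0, 0.0],     # one stochastic matrix per action
                    [0.5, 0.5]]),
     "b": np.array([[0.0, 1.0],
                    [0.0, 1.0]])}

def outcome(d0, word):
    """Distribution-outcome: the state distributions produced by playing
    a fixed word of actions from the initial distribution d0."""
    d = np.asarray(d0, dtype=float)
    for act in word:
        d = d @ M[act]
        yield d

def eventually_p_synchronizing(d0, word, p):
    """True if some distribution in the (finite-prefix) sequence puts
    probability mass at least p on a single state."""
    return any(d.max() >= p for d in outcome(d0, word))

print(eventually_p_synchronizing([0.5, 0.5], "ab" * 5, p=1.0))  # True
```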

University of Akron
6.
Chippa, Mukesh K.
Goal-seeking Decision Support System to Empower Personal Wellness Management.
Degree: PhD, Computer Engineering, 2016, University of Akron
URL: http://rave.ohiolink.edu/etdc/view?acc_num=akron1480413936639467
Obesity has reached epidemic proportions globally, with more than one billion adults overweight and at least three hundred million of them clinically obese; this is a major contributor to the global burden of chronic disease and disability. This can also be associated with rising health care costs; in the USA more than 75% of health care costs relate to chronic conditions such as diabetes and hypertension. While there are various technological advancements in fitness tracking devices such as Fitbit, and many employers offer wellness programs, such programs and devices have not been able to create societal-scale transformations in the lifestyle of the users. The challenge in keeping healthy people healthy, and helping them to be intrinsically motivated to manage their own health, is the focus of this investigation on Personal Wellness Management. In this dissertation, this problem is presented as decision making under uncertainty, where the participant takes an action at discrete time steps and the outcome of the action is uncertain. The main focus is to formulate the decision-making problem in the Goal-seeking framework. To evaluate this formulation, the problem was also formulated in two classical sequential decision-making frameworks: the Markov Decision Process and the Partially Observable Markov Decision Process. The sequential decision-making frameworks allow us to compute optimal policies to guide the participants' choice of actions. One of the major challenges in formulating the wellness management problem in these frameworks is the need for clinically validated data. While it is unrealistic to find such experimentally validated data, it is also not clear that the models in fact capture all the constraints that are necessary to make the optimal solutions effective for the participant. The Goal-seeking framework offers an alternative approach that does not require explicit modeling of the participant or the environment. This dissertation presents a software system that is designed in the Goal-seeking framework. The architecture of the system is extensible. A modular subsystem that is useful to visualize exercise-performance data gathered from a Kinect camera is described.
Advisors/Committee Members: Sastry, Shivakumar (Advisor).
Subjects/Keywords: Computer Engineering; decision support system; personalized wellness management; goal-seeking paradigm; Markov decision process; partially observable Markov decision process

University of Georgia
7.
Perez Barrenechea, Dennis David.
Anytime point based approximations for interactive POMDPs.
Degree: 2014, University of Georgia
URL: http://hdl.handle.net/10724/24464
Partially observable Markov decision processes (POMDPs) have been largely accepted as a rich framework for planning and control problems. In settings where multiple agents interact, POMDPs fail to model other agents explicitly. The interactive partially observable Markov decision process (I-POMDP) is a new paradigm that extends POMDPs to multiagent settings. The I-POMDP framework models other agents explicitly, making exact solution infeasible for all but the simplest settings. Thus, a need for good approximation methods arises: methods that can find solutions with tight error bounds in short periods of time. We develop a point-based method for solving finitely nested I-POMDPs approximately. The method maintains a set of belief points and forms value functions including only the value vectors that are optimal at these belief points. Since I-POMDP computation depends on the prediction of the actions of other agents in multiagent settings, an interactive generalization of the point-based value iteration (PBVI) method, one that recursively solves all models of other agents, needed to be developed. We present empirical results on domains from the literature and discuss the computational savings of the proposed method.
Subjects/Keywords: Markov Decision Process; Multiagent systems; Decision making; POMDP
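For readers unfamiliar with the point-based family, here is roughly what the single-agent PBVI backup the thesis generalizes looks like; the interactive part (recursively solving the models of other agents) is not shown, and the array layout is my own assumption.

```python
import numpy as np

def pbvi_backup(beliefs, alphas, T, Z, R, gamma):
    """One point-based backup: for each belief point, build the best
    alpha-vector over actions, choosing per observation the best
    back-projected vector from the current set.
    T[a, s, s'] transitions, Z[a, s', o] observations, R[s, a] rewards."""
    new_alphas = []
    for b in beliefs:
        best_val, best_vec = -np.inf, None
        for a in range(T.shape[0]):
            g = R[:, a].astype(float).copy()
            for o in range(Z.shape[2]):
                proj = alphas @ (T[a] * Z[a][:, o]).T   # |Γ| x |S|
                g = g + gamma * proj[np.argmax(proj @ b)]
            if b @ g > best_val:
                best_val, best_vec = b @ g, g
        new_alphas.append(best_vec)
    return np.array(new_alphas)

# Tiny made-up POMDP: 2 states, 2 actions, 2 observations.
T = np.array([[[0.9, 0.1], [0.1, 0.9]], [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.8, 0.2], [0.2, 0.8]], [[0.6, 0.4], [0.4, 0.6]]])
R = np.array([[1.0, 0.0], [0.0, 1.0]])
beliefs = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
alphas = np.zeros((1, 2))
for _ in range(50):
    alphas = pbvi_backup(beliefs, alphas, T, Z, R, gamma=0.9)
print(alphas.round(2))
```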

University of Texas – Austin
8.
-7840-2726.
Decision analysis perspectives on sequential testing.
Degree: MSin Engineering, Operations Research and Industrial Engineering, 2020, University of Texas – Austin
URL: http://dx.doi.org/10.26153/tsw/7495
Expanding from proof-load testing, this paper utilizes a decision analysis framework to determine adequate bounds for how much one should be willing to pay to conduct testing when the load distribution of a device is uncertain. Optimal testing policy strategies are constructed and evaluated when sequentially testing a population of devices for given prior beliefs on the intrinsic toughness of the device. Heuristic extensions are analyzed to reduce computational time for practical use.
Advisors/Committee Members: Bickel, J. Eric (advisor), Hasenbein, John (committee member).
Subjects/Keywords: Sequential testing; Decision analysis; Bayesian updating; Markov decision process
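The Bayesian-updating ingredient named in the keywords admits a one-screen illustration: treat each sequential test as a Bernoulli outcome and update a Beta prior over the device's pass probability. The conjugate model and the outcomes below are illustrative choices, not the thesis's actual load-distribution model.

```python
from fractions import Fraction

alpha, beta = Fraction(1), Fraction(1)      # uniform Beta(1, 1) prior
for passed in [True, True, False, True]:    # hypothetical test outcomes
    alpha += passed                         # conjugate Beta-Bernoulli update
    beta += not passed
print("posterior mean pass probability:", alpha / (alpha + beta))  # 2/3
```

The value of a further test can then be bounded by comparing expected losses under the current posterior with those after each possible outcome, which is the decision-analysis flavor of the abstract.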

University of Newcastle
9.
Abed-Alguni, Bilal Hashem Kalil.
Cooperative reinforcement learning for independent learners.
Degree: PhD, 2014, University of Newcastle
URL: http://hdl.handle.net/1959.13/1052917
Research Doctorate - Doctor of Philosophy (PhD)
Machine learning in multi-agent domains poses several research challenges. One challenge is how to model cooperation between reinforcement learners. Cooperation between independent reinforcement learners is known to accelerate convergence to optimal solutions. In large state space problems, independent reinforcement learners normally cooperate to accelerate the learning process using decomposition techniques or knowledge sharing strategies. This thesis presents two techniques for multi-agent reinforcement learning and a comparison study. The first technique is a formal decomposition model and an algorithm for distributed systems. The second technique is a cooperative Q-learning algorithm for multi-goal decomposable systems. The comparison study compares the performance of some of the best known cooperative Q-learning algorithms for independent learners.
Distributed systems are normally organised into two levels: the system and subsystem levels. This thesis presents a formal solution for decomposition of Markov Decision Processes (MDPs) in distributed systems that takes advantage of the organisation of distributed systems and provides support for migration of learners. This is accomplished by two proposals: a Distributed, Hierarchical Learning Model (DHLM) and an Intelligent Distributed Q-Learning algorithm (IDQL), which are based on three specialisations of agents: workers, tutors and consultants. Worker agents are the actual learners and performers of tasks, while tutor agents and consultant agents are coordinators at the subsystem level and the system level, respectively. A main duty of consultant and tutor agents is the assignment of problem space to worker agents. The experimental results in a distributed hunter-prey problem suggest that IDQL converges to a solution faster than the single-agent Q-learning approach. An important feature of DHLM is that it provides a solution for migration of agents. This feature provides support for the IDQL algorithm, where the problem space of each worker agent can change dynamically. Other hierarchical RL models do not cover this issue.
Problems that have multiple goal-states can be decomposed into sub-problems by taking advantage of the loosely-coupled bonds among the goal states. In such problems, each goal state and its problem space form a sub-problem. This thesis introduces the Q-learning with Aggregation algorithm (QA-learning), an algorithm for problems with multiple goal-states that is based on two roles: learner and tutor. A learner is an agent that learns and uses the knowledge of its neighbours (tutors) to construct its Q-table. A tutor is a learner that is ready to share its Q-table with its neighbours (learners). These roles are based on the concept of learners reusing tutors' sub-solutions. This algorithm provides solutions to problems with multiple goal-states. In this algorithm, each learner incorporates its tutors' knowledge into its own Q-table calculations. A comprehensive solution can then be obtained by combining these…
Advisors/Committee Members: University of Newcastle. Faculty of Engineering & Built Environment, School of Electrical Engineering and Computer Science.
Subjects/Keywords: reinforcement learning; Q-learning; multi-agent system; distributed system; Markov decision process; factored Markov decision process; cooperation
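A minimal sketch of the learner/tutor mechanism described for QA-learning: a learner blends its tutors' Q-tables into its own before continuing ordinary Q-learning. The pointwise-max blend and the dict layout are illustrative assumptions, not necessarily the thesis's exact rule.

```python
from collections import defaultdict

def incorporate_tutors(learner_q, tutor_qs):
    """Blend tutors' Q-tables into the learner's; here, pointwise max,
    i.e., reuse the best sub-solution seen by any neighbour."""
    merged = defaultdict(float, learner_q)
    for tq in tutor_qs:
        for (s, a), v in tq.items():
            merged[(s, a)] = max(merged[(s, a)], v)
    return merged

def q_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Standard Q-learning step run by each learner after the blend."""
    best_next = max(q[(s_next, a2)] for a2 in actions)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

q = incorporate_tutors(defaultdict(float), [{("s0", "go"): 1.0}])
q_update(q, "s0", "go", r=0.5, s_next="s1", actions=["go", "stay"])
print(dict(q))
```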

Queensland University of Technology
10.
Al Sabban, Wesam H.
Autonomous vehicle path planning for persistence monitoring under uncertainty using Gaussian based Markov decision process.
Degree: 2015, Queensland University of Technology
URL: https://eprints.qut.edu.au/82297/
One of the main challenges facing online and offline path planners is the uncertainty in the magnitude and direction of the environmental energy, because it is dynamic, changes with time, and is hard to forecast. This thesis develops an artificial-intelligence method that lets a mobile robot learn from historical or forecasted data of the environmental energy available in the area of interest, supporting persistent monitoring under uncertainty using the developed algorithm.
Subjects/Keywords: Autonomous Vehicle Path Planning; Path Planning Under Uncertainty; Markov Decision Process; Gaussian Based Markov Decision Process; GMDP; UAV; UAS; AUV

University of Georgia
11.
Trivedi, Maulesh.
Inverse learning of robot behavior for ad-hoc teamwork.
Degree: 2017, University of Georgia
URL: http://hdl.handle.net/10724/36912
Machine Learning and Robotics present a very intriguing combination of research in Artificial Intelligence. Inverse Reinforcement Learning (IRL) algorithms have generated a great deal of interest in the AI community in recent years. However, very little research has been done on modelling agent interactions in multi-robot ad-hoc settings after learning is complete. Moreover, incorporating IRL in practical robot environments that deal with online learning and high levels of uncertainty is a challenge. While decision-theoretic frameworks used for planning in these environments provide good approximations for computing an optimal policy for an agent, the model parameters are usually specified by a human designer. We describe a unique Bayesian approach to approximate unknown state transition functions. We then propose a novel multi-agent Best Response Model that plugs in the expert's reward structure learnt through Maximum Entropy Inverse Reinforcement Learning, and use the learnt transition functions from our Bayes-Adaptive approach to compute an optimal best-response policy for our multi-robot ad-hoc setting. We test our algorithms on a robot debris-sorting task.
Subjects/Keywords: Inverse Reinforcement Learning; Markov Decision Process; Bayes Adaptive Markov Decision Process; Best Response Model; Dec MDP; Optimal Policy; Reward Function
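The "unique Bayesian approach to approximate unknown state transition functions" suggests something in the spirit of Dirichlet counting over observed transitions; the sketch below is that standard idea under invented names, with the MaxEnt IRL and best-response pieces omitted.

```python
from collections import Counter, defaultdict

counts = defaultdict(Counter)   # (state, action) -> Counter of next states

def observe(s, a, s_next):
    """Record one observed transition."""
    counts[(s, a)][s_next] += 1

def t_hat(s, a, s_next, prior=1.0, n_states=3):
    """Posterior-mean transition probability under a symmetric Dirichlet
    prior with concentration `prior` over `n_states` successors."""
    c = counts[(s, a)]
    return (c[s_next] + prior) / (sum(c.values()) + prior * n_states)

observe("on_belt", "pick", "sorted")
observe("on_belt", "pick", "dropped")
print(t_hat("on_belt", "pick", "sorted"))   # (1 + 1) / (2 + 3) = 0.4
```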

University of Illinois – Urbana-Champaign
12.
Baharian Khoshkhou, Golshid.
Stochastic sequential assignment problem.
Degree: PhD, 0127, 2014, University of Illinois – Urbana-Champaign
URL: http://hdl.handle.net/2142/50503
The stochastic sequential assignment problem (SSAP) studies the allocation of available distinct workers with deterministic values to sequentially-arriving tasks with stochastic parameters so as to maximize the expected total reward obtained from the assignments. The difficulty and challenge in making the assignment decisions is that the assignments are performed in real-time; specifically, pairing a worker with a task is done without knowledge of future task values. This thesis focuses on studying practical variations and extensions of the SSAP, with the goal of eliminating restricting assumptions so that the problem setting converges to that of real-world problems.
The existing SSAP literature considers a risk-neutral objective function, seeking an assignment policy to maximize the expected total reward; however, a risk-neutral objective function is not always desirable for the decision-maker since the probability distribution function (pdf) of the total reward might carry a high probability of low values. To take this issue into account, the first part of this dissertation studies the SSAP under a risk-sensitive objective function. Specifically, the assignments are performed so as to minimize the threshold probability, which is the probability of the total reward failing to achieve a specified target (threshold). A target-dependent Markov decision process (MDP) is solved, and sufficient conditions for the existence of a deterministic Markov optimal policy are provided. An approximate algorithm is presented, and convergence of the approximate value function to the optimal value function is established under mild conditions.
The second part of this thesis analyzes the limiting behavior of the SSAP as the number of assignments approaches infinity. The optimal assignment policy for the basic SSAP has a threshold structure and involves computing a new set of breakpoints upon the arrival of each task, which is cumbersome for large-scale problems. To address this issue, the second part of this dissertation focuses on obtaining stationary (time-independent) optimal assignment policies that maximize the long-run expected reward per task and are much easier to perform in real-world problems. An exponential convergence rate is established for the convergence of the expected total reward per task to the optimal value as the number of tasks approaches infinity. The limiting behavior of the SSAP is studied in two different settings. The first setting assumes an independent and identically distributed (IID) sequence of arriving tasks with observable distribution functions, while the second problem considers the case where task distributions are unobservable and belong to a pool of feasible distributions.
The next part of this dissertation basically brings the first two parts together, studying the limiting behavior of the target-dependent SSAP, where the goal is finding an assignment policy that minimizes the probability of the long-run reward per task failing to achieve a given target value. It is proven that the…
Advisors/Committee Members: Jacobson, Sheldon H. (advisor), Chen, Xin (Committee Chair), Jacobson, Sheldon H. (committee member), Kiyavash, Negar (committee member), Shanbhag, Vinayak V. (committee member).
Subjects/Keywords: sequential assignment; Markov decision process; stationary policy; hidden Markov model; threshold criteria; risk measure
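The target-dependent MDP mentioned in the abstract can be written as a recursion on an augmented state that carries the remaining target; the notation below is a hedged reconstruction, not the thesis's own.

```latex
% v_n(s, \tau): minimal probability that the total reward collected over
% the n remaining assignments, starting from state s, falls short of \tau.
\[
  v_0(s,\tau) = \mathbf{1}\{\tau > 0\},
  \qquad
  v_n(s,\tau) = \min_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\,
                v_{n-1}\bigl(s',\, \tau - r(s,a)\bigr).
\]
```

Under conditions such as those the abstract refers to, the minimizing actions define a deterministic Markov policy on the augmented state (s, τ).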

Cornell University
13.
Feng, Jiekun.
Markov chain, Markov decision process, and deep reinforcement learning with applications to hospital management and real-time ride-hailing.
Degree: PhD, Statistics, 2020, Cornell University
URL: http://hdl.handle.net/1813/102998
This dissertation summarizes research in the area of Markov chains, Markov decision processes, and deep reinforcement learning. Motivated by customer queues in a call center or a hospital inpatient ward, the stationary distributions of continuous- and discrete-time Markov chains often have important implications. As the exact forms of these stationary distributions may not exist, and may furthermore be time-consuming to evaluate, closed-form approximations of high quality are desirable. Utilizing Stein's method, we provide a framework to 1) construct an appropriate approximator to the stationary distribution of a Markov chain; and 2) establish an upper bound on the approximation error that is quite sharp.
Extending Markov chains to Markov decision processes, we develop a framework to model a common yet challenging problem posed to real-time ride-hailing services or platforms such as Uber: in anticipation of future demand/supply conditions, how to carry out car-passenger matching and empty-car routing so as to optimize a certain objective over a time horizon of interest, typically several hours to a day. In the face of the immense state and action space entailed by the complicated time-varying environment of the ride-hailing service, we 1) decompose the decision at every decision time point into a sequence of instantaneous actions, each induced by a single car or passenger; and 2) develop an algorithm combining state-of-the-art deep learning and reinforcement learning (deep reinforcement learning) techniques. While recent years have witnessed the prevalence of deep reinforcement learning research with real-world impact in a number of applications, ride-hailing among them, our work is unique in that it seeks to 1) optimize the long-term utility from the perspective of the system instead of the individual cars; in particular, we use global demand and supply conditions as our features, which do not concentrate upon any single agent (car candidate in our case). In fact, standing at any (instantaneous) decision time point, we do not know which car candidate to process (as opposed to following the conventional first-in-first-out rule), and let the features make the selection. We also 2) learn the optimal policy directly through the reinforcement learning framework; for this purpose, our algorithm includes both policy evaluation and policy improvement steps over multiple policy iterations. A numerical study based on real-world data from Didi Chuxing demonstrates the efficacy of our methodology: it rapidly and stably improves the fraction of ride requests fulfilled to a level comparable to the state-of-the-art result and, after 73 policy iterations, surpasses it by 3%.
Advisors/Committee Members: Dai, Jim (chair), Wells, Martin Timothy (committee member), Iyer, Krishnamurthy (committee member).
Subjects/Keywords: deep learning; Markov chain; Markov decision process; Proximal Policy Optimization; reinforcement learning
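The decomposition described above (a joint dispatch decision broken into a sequence of instantaneous, feature-scored actions) can be caricatured in a few lines. The feature layout, the stand-in scorer, and the zone names are all hypothetical; the dissertation's learned value network and training loop are not shown.

```python
def score(features, option):
    """Stand-in for the learned value function over *global* features;
    here it simply prefers zones with more unmet demand."""
    return features["demand"][option]

def dispatch(cars, features):
    plan = []
    for car in cars:                 # one instantaneous action per car
        best = max(features["demand"], key=lambda o: score(features, o))
        plan.append((car, best))     # matching / empty-car routing decision
        features["demand"][best] = max(0, features["demand"][best] - 1)
        features["supply"] -= 1      # global features reflect each action
    return plan

features = {"supply": 3, "demand": {"zoneA": 2, "zoneB": 1}}
print(dispatch(["car1", "car2"], features))
```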

Cornell University
14.
Kumar, Ravi.
Dynamic Resource Management For Systems With Controllable Service Capacity.
Degree: PhD, Operations Research, 2015, Cornell University
URL: http://hdl.handle.net/1813/41011
The rise in Internet traffic volumes has led to a growing interest in reducing the energy costs of IT infrastructure. Resource management policies for such systems, known as power-aware policies, are becoming increasingly important. We propose a dynamic capacity management framework for such systems to save energy costs without sacrificing quality of service. The system incurs two types of costs: a holding cost, which reflects the Quality of Service (QoS) experienced by the users, and a cost of effort (utilization/energy cost) based on the amount of resources being used. The key challenge for the service provider is balancing these two objectives, as using more resources improves the system performance but also leads to higher utilization costs. This tension between delivering good performance and reducing energy costs is the central feature of the models considered in this work. In order to make good capacity allocation decisions for such scenarios, we formulate two queueing control problems using the Markov Decision Process framework. We first consider the problem of service rate control of a single-server queueing system with a finite-state Markov-modulated Poisson arrival process. We show that the optimal service rate is non-decreasing in the number of jobs in the system; higher congestion levels warrant higher service rates. On the contrary, however, we show that the optimal service rate is not necessarily monotone in the current arrival rate. If the modulating process satisfies a stochastic monotonicity property, the monotonicity is recovered. We examine several heuristics and show where heuristics are reasonable substitutes for the optimal control. In the next model we consider the problem of dynamic resource allocation in the presence of different job classes. In this case, the service provider has to make two decisions: how many resources to utilize and how to allocate these resources among the various job classes. These decisions, made dynamically based on the congestion levels of the job classes, minimize a weighted sum of utilization and QoS-related costs. We show that a priority policy is optimal for scheduling the jobs of different classes. We further show that the optimal resource utilization levels are non-decreasing in the number of jobs of each type. We extend this model to the case where server reallocations incur a switching penalty. In this case, we show that the optimal policy is hard to characterize and computing the optimal solution using the Dynamic Programming framework becomes intractable. Instead, we develop several heuristics and perform numerical studies to compare their performance.
Advisors/Committee Members: Lewis, Mark E. (chair), Henderson, Shane G. (committee member), Topaloglu, Huseyin (committee member).
Subjects/Keywords: Markov Decision Process; Service Rate Control; Dynamic Power Management
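The first structural result (optimal service rate non-decreasing in queue length) is cheap to probe numerically on a simplified version of the model: fixed-rate Poisson arrivals instead of Markov-modulated ones, and illustrative costs. A value-iteration sketch on the uniformized chain:

```python
import numpy as np

lam, rates = 0.5, [0.0, 0.4, 0.8]          # arrival rate, service options
cost = lambda x, mu: x + 2.0 * mu          # holding cost + effort cost
N, gamma = 50, 0.99                        # queue cap, discount factor
U = lam + max(rates)                       # uniformization constant

def q_value(V, x, mu):
    mu_eff = mu if x > 0 else 0.0          # cannot serve an empty queue
    ev = (lam * V[min(x + 1, N)] + mu_eff * V[max(x - 1, 0)]
          + (U - lam - mu_eff) * V[x]) / U
    return cost(x, mu) + gamma * ev

V = np.zeros(N + 1)
for _ in range(2000):                      # iterate to (near) fixed point
    V = np.array([min(q_value(V, x, mu) for mu in rates)
                  for x in range(N + 1)])
policy = [min(rates, key=lambda mu: q_value(V, x, mu)) for x in range(8)]
print("optimal rate by queue length:", policy)   # non-decreasing in x
```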

Penn State University
15.
Hu, Nan.
Stochastic Resource Allocation Strategies With Uncertain Information In Sensor Networks.
Degree: 2016, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/13593nqh5045
► Support for intelligent and autonomous resource management is one of the key factors to the success of modern sensor network systems. The limited resources, such…
(more)
▼ Support for intelligent and autonomous resource management is one of the key factors in the success of modern sensor network systems. The limited resources, such as exhaustible battery life, moderate processing ability and finite bandwidth, restrict the system’s ability to simultaneously accommodate all missions that are submitted by users. In order to achieve the optimal profit in such dynamic conditions, the value of each mission, quantified by its demand on resources and its achievable profit, needs to be properly evaluated in different situations.
In practice, uncertainties may exist throughout the entire execution of a mission, and thus should not be ignored. For a single mission, owing to uncertainties such as the unreliable wireless medium and the variable quality of sensor outputs, both the demands and the profits of the mission may be non-deterministic and hard to predict precisely. Moreover, throughout the process of execution, each mission may experience multiple states, the transitions between which may be affected by different conditions. Even if the current state of a mission is identified, because multiple potential transitions may occur, each leading to different consequences, the subsequent state cannot be confirmed until the transition actually occurs. In systems with multiple missions, each with uncertainties, a more complicated circumstance arises, in which the strategy for resource allocation among missions needs to be modified adaptively and dynamically based on both the present status and the potential evolution of all missions.
In our research, we take into account several levels of uncertainty that may be faced when allocating limited resources in dynamic environments as described above, where the concept of missions that require resources can be mapped to that of certain network applications. Our algorithms calculate resource allocation solutions for the corresponding scenarios and always aim to achieve high profit, as well as other performance improvements (e.g., resource utilization rate, mission preemption rate, etc.).
Given a fixed set of missions, we consider both demands and profits as random variables whose values follow certain distributions and may change over time. Since the profit is not constant, rather than achieving a specific maximized profit, our objective is to select the optimal set of missions so as to maximize a certain percentile of their combined profit, while constraining the probability of resource capacity violation within an acceptable threshold. Note that, in this scenario, the selection of missions is final and does not change after the decision has been made; this static solution therefore fits only applications with long-running missions.
For scenarios with both long-term and short-term missions, to increase the total achieved profit, instead of selecting a fixed mission set we propose a dynamic strategy that tunes mission selections adaptively to the changing environment. We take a surveillance application as an example, where missions are targeting specific sets of events,…
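As an illustration of the static selection step, the sketch below brute-forces the mission subset that maximizes the 10th percentile of combined profit, subject to a cap on the probability of violating resource capacity. The distributions, capacity and percentile are invented for the example; the thesis's actual formulation may differ.

import itertools
import random

random.seed(0)
MISSIONS = [   # (mean demand, demand std, mean profit, profit std)
    (3.0, 1.0, 10.0, 4.0),
    (5.0, 2.0, 18.0, 6.0),
    (2.0, 0.5, 6.0, 1.0),
    (4.0, 1.5, 12.0, 5.0),
]
CAPACITY, MAX_VIOL_PROB, SAMPLES = 10.0, 0.05, 5000

def evaluate(subset):
    """Monte Carlo estimate of 10th-percentile profit and violation probability."""
    profits, violations = [], 0
    for _ in range(SAMPLES):
        demand = sum(max(0.0, random.gauss(m[0], m[1])) for m in subset)
        profit = sum(random.gauss(m[2], m[3]) for m in subset)
        profits.append(profit)
        violations += demand > CAPACITY
    profits.sort()
    return profits[int(0.10 * SAMPLES)], violations / SAMPLES

scores = {}
for r in range(len(MISSIONS) + 1):
    for subset in itertools.combinations(range(len(MISSIONS)), r):
        pct, viol = evaluate([MISSIONS[i] for i in subset])
        if viol <= MAX_VIOL_PROB:          # feasibility: capacity chance constraint
            scores[subset] = pct
best_subset = max(scores, key=scores.get)  # indices of the missions to accept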
Advisors/Committee Members: Thomas F Laporta, Dissertation Advisor/Co-Advisor, Thomas F Laporta, Committee Chair/Co-Chair, Patrick Drew Mcdaniel, Committee Member, Sencun Zhu, Committee Member, Costas D Maranas, Outside Member.
Subjects/Keywords: stochastic resource allocation; markov decision process; uncertainty; sensor network
APA (6th Edition):
Hu, N. (2016). Stochastic Resource Allocation Strategies With Uncertain Information In Sensor Networks. (Thesis). Penn State University. Retrieved from https://submit-etda.libraries.psu.edu/catalog/13593nqh5045
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Hu, Nan. “Stochastic Resource Allocation Strategies With Uncertain Information In Sensor Networks.” 2016. Thesis, Penn State University. Accessed April 11, 2021.
https://submit-etda.libraries.psu.edu/catalog/13593nqh5045.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Hu, Nan. “Stochastic Resource Allocation Strategies With Uncertain Information In Sensor Networks.” 2016. Web. 11 Apr 2021.
Vancouver:
Hu N. Stochastic Resource Allocation Strategies With Uncertain Information In Sensor Networks. [Internet] [Thesis]. Penn State University; 2016. [cited 2021 Apr 11].
Available from: https://submit-etda.libraries.psu.edu/catalog/13593nqh5045.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Hu N. Stochastic Resource Allocation Strategies With Uncertain Information In Sensor Networks. [Thesis]. Penn State University; 2016. Available from: https://submit-etda.libraries.psu.edu/catalog/13593nqh5045
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Queensland University of Technology
16.
Glover, Arren John.
Developing grounded representations for robots through the principles of sensorimotor coordination.
Degree: 2014, Queensland University of Technology
URL: https://eprints.qut.edu.au/71763/
► Robots currently recognise and use objects through algorithms that are hand-coded or specifically trained. Such robots can operate in known, structured environments but cannot learn…
(more)
▼ Robots currently recognise and use objects through algorithms that are hand-coded or specifically trained. Such robots can operate in known, structured environments but cannot learn to recognise or use novel objects as they appear. This thesis demonstrates that a robot can develop meaningful object representations by learning the fundamental relationship between action and change in sensory state; the robot learns sensorimotor coordination. Methods based on Markov Decision Processes are experimentally validated on a mobile robot capable of gripping objects, and it is found that object recognition and manipulation can be learnt as an emergent property of sensorimotor coordination.
Subjects/Keywords: Robotics; Affordance; Visual Object Recognition; Symbol Grounding; Markov Decision Process
APA (6th Edition):
Glover, A. J. (2014). Developing grounded representations for robots through the principles of sensorimotor coordination. (Thesis). Queensland University of Technology. Retrieved from https://eprints.qut.edu.au/71763/
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Glover, Arren John. “Developing grounded representations for robots through the principles of sensorimotor coordination.” 2014. Thesis, Queensland University of Technology. Accessed April 11, 2021.
https://eprints.qut.edu.au/71763/.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Glover, Arren John. “Developing grounded representations for robots through the principles of sensorimotor coordination.” 2014. Web. 11 Apr 2021.
Vancouver:
Glover AJ. Developing grounded representations for robots through the principles of sensorimotor coordination. [Internet] [Thesis]. Queensland University of Technology; 2014. [cited 2021 Apr 11].
Available from: https://eprints.qut.edu.au/71763/.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Glover AJ. Developing grounded representations for robots through the principles of sensorimotor coordination. [Thesis]. Queensland University of Technology; 2014. Available from: https://eprints.qut.edu.au/71763/
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Australian National University
17.
Karim, Mohammad Shahedul.
Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications.
Degree: 2016, Australian National University
URL: http://hdl.handle.net/1885/118239
► The network coding paradigm enhances transmission efficiency by combining information flows and has drawn significant attention in information theory, networking, communications and data storage. Instantly…
(more)
▼ The network coding paradigm enhances transmission efficiency by combining information flows and has drawn significant attention in information theory, networking, communications and data storage. Instantly decodable network coding (IDNC), a subclass of network coding, has demonstrated its ability to improve the quality of service of time-critical applications thanks to its attractive properties, namely throughput enhancement, delay reduction, simple XOR-based encoding and decoding, and small coefficient overhead. Nonetheless, for point-to-multi-point (PMP) networks, IDNC cannot guarantee the decoding of a specific new packet at individual devices in each transmission. Furthermore, for device-to-device (D2D) networks, the transmitting devices may possess only a subset of the packets which can be used to form coded packets. These challenges require the optimization of IDNC algorithms to suit different application requirements and network configurations.
In this thesis, we first study a scalable live video broadcast over a wireless PMP network, where the devices receive video packets from a base station. Such layered live video has a hard deadline and imposes a decoding order on the video layers. We design two prioritized IDNC algorithms that give a high level of priority to the most important video layer before considering additional video layers in coding decisions. These prioritized algorithms are shown to increase the number of decoded video layers at the devices compared to existing network coding schemes.
We then study video distribution over a partially connected D2D network, where a group of devices cooperate with each other to recover their missing video content. We introduce a cooperation-aware IDNC graph that defines all feasible coding and transmission conflict-free decisions. Using this graph, we propose an IDNC solution that avoids coding and transmission conflicts, and meets the hard deadline for high-importance video packets. It is demonstrated that the proposed solution delivers improved video quality to the devices compared to video- and cooperation-oblivious coding schemes.
We also consider a heterogeneous network wherein devices use two wireless interfaces to receive packets from the base station and another device concurrently. For such a network, we are interested in applications with reliable in-order packet delivery requirements. We represent all feasible coding opportunities and conflict-free transmissions using a dual-interface IDNC graph. We select a maximal independent set over the graph by considering the dual interfaces of individual devices, the in-order delivery requirements of packets, and lossy channel conditions. This graph-based solution is shown to reduce the…
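As a rough illustration of the graph-based selection step described above, the following sketch greedily builds a maximal independent set over a conflict graph whose vertices are (device, wanted packet) pairs. The graph, weights and priorities are invented for the example rather than taken from the thesis.

def maximal_independent_set(vertices, conflicts, weight):
    """Greedy: repeatedly take the heaviest vertex that conflicts with nothing chosen."""
    chosen, candidates = [], set(vertices)
    while candidates:
        v = max(candidates, key=weight)
        chosen.append(v)
        candidates = {u for u in candidates
                      if u != v and (v, u) not in conflicts and (u, v) not in conflicts}
    return chosen

# Vertices are (device, wanted packet); an edge marks a pair that cannot be
# combined into one instantly decodable XOR transmission.
vertices = [(1, "p1"), (2, "p2"), (3, "p1"), (3, "p3")]
conflicts = {((1, "p1"), (2, "p2")), ((2, "p2"), (3, "p3"))}
priority = {"p1": 3, "p2": 2, "p3": 1}        # e.g. video-layer importance
coded_set = maximal_independent_set(vertices, conflicts,
                                    lambda v: priority[v[1]])
# The packets named in coded_set are XORed into the next transmission.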
Subjects/Keywords: Network Coding; Wireless Communications; Video Streaming; Markov Decision Process; Graph Theory
APA (6th Edition):
Karim, M. S. (2016). Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications. (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/118239
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Karim, Mohammad Shahedul. “Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications.” 2016. Thesis, Australian National University. Accessed April 11, 2021.
http://hdl.handle.net/1885/118239.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Karim, Mohammad Shahedul. “Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications.” 2016. Web. 11 Apr 2021.
Vancouver:
Karim MS. Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications. [Internet] [Thesis]. Australian National University; 2016. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/1885/118239.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Karim MS. Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications. [Thesis]. Australian National University; 2016. Available from: http://hdl.handle.net/1885/118239
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
18.
Venkatraman, Pavithra.
Opportunistic bandwidth sharing through reinforcement learning.
Degree: MS, Electrical and Computer Engineering, 2010, Oregon State University
URL: http://hdl.handle.net/1957/19126
► The enormous success of wireless technology has recently led to an explosive demand for, and hence a shortage of, bandwidth resources. This expected shortage problem…
(more)
▼ The enormous success of wireless technology has recently led to an explosive demand for, and hence a shortage of, bandwidth resources. This expected shortage problem is reported to be primarily due to the inefficient, static nature of current spectrum allocation methods. As an initial step towards solving this shortage problem, the Federal Communications Commission (FCC) has opened up the so-called opportunistic spectrum access (OSA), which allows unlicensed users to exploit unused licensed spectrum, but in a manner that limits interference to licensed users. Fortunately, technological advances have enabled cognitive radios, which are viewed as intelligent communication systems that can self-learn from their surrounding environment and auto-adapt their internal operating parameters in real time to improve spectrum efficiency. Cognitive radios have recently been recognized as the key enabling technology for realizing OSA. In this work, we propose a machine learning based scheme that exploits the cognitive radios’ capabilities to enable effective OSA, thus improving the efficiency of spectrum utilization. Specifically, we formulate the OSA problem as a finite Markov Decision Process (MDP), and use reinforcement learning (RL) to locate and exploit bandwidth opportunities effectively. Simulation results show that our scheme achieves high throughput performance without requiring any prior knowledge of the environment’s characteristics and dynamics.
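A minimal tabular Q-learning sketch in the spirit of the proposed scheme, with an invented stand-in environment whose channel idle probabilities are unknown to the learner. States, rewards and parameters are assumptions for illustration only.

import random

random.seed(1)
N_CHANNELS, STEPS, ALPHA, GAMMA, EPS = 4, 5000, 0.1, 0.9, 0.1
P_IDLE = [0.2, 0.5, 0.7, 0.9]          # hidden from the learner

Q = [[0.0] * N_CHANNELS for _ in range(N_CHANNELS)]
state = 0                              # state: channel used last
for _ in range(STEPS):
    if random.random() < EPS:
        action = random.randrange(N_CHANNELS)              # explore
    else:
        action = max(range(N_CHANNELS), key=lambda a: Q[state][a])
    reward = 1.0 if random.random() < P_IDLE[action] else 0.0   # idle => success
    nxt = action
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][action])
    state = nxt
# Q gradually comes to favor channel 3, the one most often idle.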
Advisors/Committee Members: Hamdaoui, Bechir (advisor), Liu, Huaping (committee member).
Subjects/Keywords: Markov decision process
APA (6th Edition):
Venkatraman, P. (2010). Opportunistic bandwidth sharing through reinforcement learning. (Masters Thesis). Oregon State University. Retrieved from http://hdl.handle.net/1957/19126
Chicago Manual of Style (16th Edition):
Venkatraman, Pavithra. “Opportunistic bandwidth sharing through reinforcement learning.” 2010. Masters Thesis, Oregon State University. Accessed April 11, 2021.
http://hdl.handle.net/1957/19126.
MLA Handbook (7th Edition):
Venkatraman, Pavithra. “Opportunistic bandwidth sharing through reinforcement learning.” 2010. Web. 11 Apr 2021.
Vancouver:
Venkatraman P. Opportunistic bandwidth sharing through reinforcement learning. [Internet] [Masters thesis]. Oregon State University; 2010. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/1957/19126.
Council of Science Editors:
Venkatraman P. Opportunistic bandwidth sharing through reinforcement learning. [Masters Thesis]. Oregon State University; 2010. Available from: http://hdl.handle.net/1957/19126

University of Ottawa
19.
Astaraky, Davood.
A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling.
Degree: 2013, University of Ottawa
URL: http://hdl.handle.net/10393/23622
► The thesis focuses on a model that seeks to address patient scheduling step of the surgical scheduling process to determine the number of surgeries to…
(more)
▼ The thesis focuses on a model that addresses the patient scheduling step of the surgical scheduling process: determining the number of surgeries to perform on a given day. Specifically, given a master schedule that provides a cyclic breakdown of total OR availability into daily allocations to each surgical specialty, we look to provide a scheduling policy for all surgeries that minimizes a combination of the lead time between patient request and surgery date, overtime in the ORs, and congestion in the wards. We cast the problem of generating optimal control strategies into the framework of the Markov Decision Process (MDP). The Approximate Dynamic Programming (ADP) approach is employed to solve the model, which would otherwise be intractable due to the size of the state space. We assess the performance and quality of the resulting policy through simulation, and we provide policy insights and conclusions.
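To illustrate the simulation-based ADP idea, the sketch below approximates the value function by a linear combination of simple state features (surgery queue length, ward census) and fits the weights against single-sample Bellman targets obtained by simulation. Features, costs and dynamics are invented for the example, not taken from the thesis.

import random

random.seed(2)
W = [0.0, 0.0, 0.0]                        # weights: bias, queue, ward census

def phi(state):
    queue, census = state
    return [1.0, queue / 20.0, census / 30.0]   # normalized features

def v_hat(state):
    return sum(w * f for w, f in zip(W, phi(state)))

def immediate_cost(state, a):
    queue, census = state
    overtime = max(0, a - 5) * 2.0         # beyond 5 surgeries/day costs OR overtime
    return queue + 3.0 * overtime + 0.5 * census

def simulate(state, a):
    """Single-sample next state after scheduling `a` surgeries today."""
    queue, census = state
    arrivals, discharges = random.randint(0, 4), random.randint(0, 3)
    return (max(0, queue - a) + arrivals,
            min(30, max(0, census + a - discharges)))

for _ in range(20000):                     # approximate value iteration sweeps
    s = (random.randint(0, 20), random.randint(0, 30))
    target = min(immediate_cost(s, a) + 0.9 * v_hat(simulate(s, a))
                 for a in range(9))        # decision: surgeries to perform today
    td = target - v_hat(s)                 # temporal-difference error
    W[:] = [w + 0.01 * td * f for w, f in zip(W, phi(s))]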
Subjects/Keywords: Approximate Dynamic Programming; Surgical Scheduling; Markov Decision Process
APA (6th Edition):
Astaraky, D. (2013). A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling. (Thesis). University of Ottawa. Retrieved from http://hdl.handle.net/10393/23622
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Astaraky, Davood. “A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling.” 2013. Thesis, University of Ottawa. Accessed April 11, 2021.
http://hdl.handle.net/10393/23622.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Astaraky, Davood. “A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling.” 2013. Web. 11 Apr 2021.
Vancouver:
Astaraky D. A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling. [Internet] [Thesis]. University of Ottawa; 2013. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/10393/23622.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Astaraky D. A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling. [Thesis]. University of Ottawa; 2013. Available from: http://hdl.handle.net/10393/23622
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
20.
Denis, Nicholas.
On Hierarchical Goal Based Reinforcement Learning.
Degree: 2019, University of Ottawa
URL: http://hdl.handle.net/10393/39552
► Discrete time sequential decision processes require that an agent select an action at each time step. As humans, we plan over long time horizons and…
(more)
▼ Discrete-time sequential decision processes require that an agent select an action at each time step. As humans, we plan over long time horizons and use temporal abstraction by selecting temporally extended actions such as “make lunch” or “get a masters degree”, each of which is composed of more granular actions. This thesis concerns itself with such hierarchical temporal abstractions in the form of macro actions and options, as they apply to goal-based Markov Decision Processes. A novel algorithm for discovering hierarchical macro actions in goal-based MDPs, as well as a novel algorithm utilizing landmark options for transfer learning in multi-task goal-based reinforcement learning settings, are introduced. Theoretical properties regarding the life-long regret of an agent executing the latter algorithm are also discussed.
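The options formalism at the heart of this record can be summarized in a few lines: an option bundles an initiation set, an internal policy and a termination condition, and executing it yields one temporally extended (SMDP) transition. The sketch below, including the toy corridor environment, is an illustrative assumption rather than the thesis's construction.

class Option:
    def __init__(self, initiation, policy, terminate):
        self.initiation = initiation      # states where the option may start
        self.policy = policy              # state -> primitive action
        self.terminate = terminate        # state -> True when the option ends

def run_option(env_step, state, option, gamma=0.99):
    """Execute the option to termination; one temporally extended SMDP step."""
    total, disc = 0.0, 1.0
    while not option.terminate(state):
        state, reward = env_step(state, option.policy(state))
        total += disc * reward
        disc *= gamma
    return state, total, disc

# Toy 1-D corridor of 10 cells; the goal state is cell 9.
def env_step(s, a):                       # a in {-1, +1}
    s2 = min(9, max(0, s + a))
    return s2, (1.0 if s2 == 9 else 0.0)

to_goal = Option(initiation=set(range(10)),
                 policy=lambda s: +1,     # always move right
                 terminate=lambda s: s == 9)
final_state, reward, disc = run_option(env_step, 0, to_goal)
# SMDP Q-learning would then update
#   Q(s, o) += alpha * (reward + disc * max_o' Q(s', o') - Q(s, o))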
Subjects/Keywords: Markov decision process; Reinforcement learning; Options framework; Temporal abstraction; Macro actions
APA (6th Edition):
Denis, N. (2019). On Hierarchical Goal Based Reinforcement Learning. (Thesis). University of Ottawa. Retrieved from http://hdl.handle.net/10393/39552
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Denis, Nicholas. “On Hierarchical Goal Based Reinforcement Learning.” 2019. Thesis, University of Ottawa. Accessed April 11, 2021.
http://hdl.handle.net/10393/39552.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Denis, Nicholas. “On Hierarchical Goal Based Reinforcement Learning.” 2019. Web. 11 Apr 2021.
Vancouver:
Denis N. On Hierarchical Goal Based Reinforcement Learning. [Internet] [Thesis]. University of Ottawa; 2019. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/10393/39552.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Denis N. On Hierarchical Goal Based Reinforcement Learning. [Thesis]. University of Ottawa; 2019. Available from: http://hdl.handle.net/10393/39552
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Delft University of Technology
21.
Walraven, E.M.P. (author).
Traffic Flow Optimization using Reinforcement Learning.
Degree: 2014, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:67d499a4-4398-416f-bb51-372bcaa25ac1
► Traffic congestion causes unnecessary delay, pollution and increased fuel consumption. In this thesis we address this problem by proposing new algorithmic techniques to reduce traffic…
(more)
▼ Traffic congestion causes unnecessary delay, pollution and increased fuel consumption. In this thesis we address this problem by proposing new algorithmic techniques to reduce traffic congestion and we contribute to the development of a new Intelligent Transportation System. We present a method to determine speed limits, in which we combine a traffic flow model with reinforcement learning techniques. A traffic flow optimization problem is formulated as a Markov Decision Process, and subsequently solved using Q-learning enhanced with value function approximation. This results in a single-agent and multi-agent approach to assign speed limits to highway sections. A difference between our work and existing approaches is that we also take traffic predictions into account. The performance of our method is evaluated in macroscopic simulations, in which we show that it is able to significantly reduce congestion under high traffic demands. A case study has been performed to evaluate the effectiveness of our method in microscopic simulations. The case study serves as a proof of concept and shows that our method performs well on a real scenario.
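A minimal sketch of Q-learning with linear value function approximation for assigning a speed limit to a single highway section, where one feature stands in for the traffic prediction the thesis incorporates. The toy dynamics, features and reward are assumptions for illustration.

import random

random.seed(3)
ACTIONS = [60, 80, 100, 120]               # candidate speed limits (km/h)
ALPHA, GAMMA, EPS = 0.01, 0.95, 0.1
w = {a: [0.0, 0.0, 0.0] for a in ACTIONS}  # one weight vector per action

def phi(density, predicted):
    return [1.0, density / 100.0, predicted / 100.0]   # bias + normalized features

def q(feat, a):
    return sum(wi * fi for wi, fi in zip(w[a], feat))

density, predicted = 30.0, 35.0
for _ in range(20000):
    feat = phi(density, predicted)
    a = (random.choice(ACTIONS) if random.random() < EPS
         else max(ACTIONS, key=lambda x: q(feat, x)))   # epsilon-greedy
    # toy dynamics: lowering the limit relieves congestion at high density
    relief = (120 - a) * 0.02 if density > 40 else 0.0
    density = max(5.0, density + random.gauss(1.0, 2.0) - relief)
    predicted = density + random.gauss(2.0, 1.0)        # stand-in traffic prediction
    reward = -max(0.0, density - 40.0)                  # penalize congestion
    nxt = phi(density, predicted)
    td = reward + GAMMA * max(q(nxt, x) for x in ACTIONS) - q(feat, a)
    w[a] = [wi + ALPHA * td * fi for wi, fi in zip(w[a], feat)]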
Algorithmics
Software and Computer Technology
Electrical Engineering, Mathematics and Computer Science
Advisors/Committee Members: Spaan, M.T.J. (mentor).
Subjects/Keywords: reinforcement learning; markov decision process; traffic flow optimization; speed limits
APA (6th Edition):
Walraven, E. M. P. (2014). Traffic Flow Optimization using Reinforcement Learning. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:67d499a4-4398-416f-bb51-372bcaa25ac1
Chicago Manual of Style (16th Edition):
Walraven, E M P (author). “Traffic Flow Optimization using Reinforcement Learning.” 2014. Masters Thesis, Delft University of Technology. Accessed April 11, 2021.
http://resolver.tudelft.nl/uuid:67d499a4-4398-416f-bb51-372bcaa25ac1.
MLA Handbook (7th Edition):
Walraven, E M P (author). “Traffic Flow Optimization using Reinforcement Learning.” 2014. Web. 11 Apr 2021.
Vancouver:
Walraven EMP. Traffic Flow Optimization using Reinforcement Learning. [Internet] [Masters thesis]. Delft University of Technology; 2014. [cited 2021 Apr 11].
Available from: http://resolver.tudelft.nl/uuid:67d499a4-4398-416f-bb51-372bcaa25ac1.
Council of Science Editors:
Walraven EMP. Traffic Flow Optimization using Reinforcement Learning. [Masters Thesis]. Delft University of Technology; 2014. Available from: http://resolver.tudelft.nl/uuid:67d499a4-4398-416f-bb51-372bcaa25ac1

University of Toronto
22.
Delesalle, Samuel.
Maintenance and Reliability Models for a Public Transit Bus.
Degree: 2020, University of Toronto
URL: http://hdl.handle.net/1807/103534
► Public transit buses are an essential part of modern society and are prone to frequent breakdowns. The goal of this research is to develop an…
(more)
▼ Public transit buses are an essential part of modern society and are prone to frequent breakdowns. The goal of this research is to develop an opportunistic maintenance policy that maximizes bus availability. A transit bus is composed of over 10 000 components which can be classified into modules based on their location and function in the bus. A reliability model considering imperfect repair is developed using real transit bus failure data to identify the components that have the largest impact on bus availability. A method to find the maintenance policy that maximizes the availability of a bus module is developed using the semi-Markov decision process framework. Discrete event simulation is used to find an opportunistic maintenance policy that maximizes the availability for an entire transit bus. The developed maintenance policies are compared with the conventional maintenance practice of repairing only on failure.
M.A.S.
Advisors/Committee Members: Makis, Viliam, Mechanical and Industrial Engineering.
Subjects/Keywords: Maintenance; Opportunistic; Reliability; Semi-Markov Decision Process; Simulation; Transit Bus; 0546
APA (6th Edition):
Delesalle, S. (2020). Maintenance and Reliability Models for a Public Transit Bus. (Masters Thesis). University of Toronto. Retrieved from http://hdl.handle.net/1807/103534
Chicago Manual of Style (16th Edition):
Delesalle, Samuel. “Maintenance and Reliability Models for a Public Transit Bus.” 2020. Masters Thesis, University of Toronto. Accessed April 11, 2021.
http://hdl.handle.net/1807/103534.
MLA Handbook (7th Edition):
Delesalle, Samuel. “Maintenance and Reliability Models for a Public Transit Bus.” 2020. Web. 11 Apr 2021.
Vancouver:
Delesalle S. Maintenance and Reliability Models for a Public Transit Bus. [Internet] [Masters thesis]. University of Toronto; 2020. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/1807/103534.
Council of Science Editors:
Delesalle S. Maintenance and Reliability Models for a Public Transit Bus. [Masters Thesis]. University of Toronto; 2020. Available from: http://hdl.handle.net/1807/103534

University of Windsor
23.
Islam, Kingshuk Jubaer.
Outsourcing Evaluation in RL Network.
Degree: MA, Industrial and Manufacturing Systems Engineering, 2012, University of Windsor
URL: https://scholar.uwindsor.ca/etd/5347
► This thesis addresses the qualitative investigation of the reverse logistics and outsourcing and a quantitative analysis of reverse logistic networks that covenant with the…
(more)
▼ This thesis addresses a qualitative investigation of reverse logistics and outsourcing, and a quantitative analysis of reverse logistics networks that deal with the option of outsourcing or in-house remanufacturing. Two models are proposed with the objective of contributing to the decision-making process for reverse logistics outsourcing. The purpose is to find a set of decisions throughout the product life cycle that maximizes both outsourcing and in-house remanufacturing. These models also serve to verify two hypotheses: that outsourcing is more likely to be the optimal solution when the variance in the return rate is high, and when the product life cycle is short. Then, a solution approach based on an MDP is designed for solving this problem, considering the firm under both a dynamic capacity model and a stationary capacity model. Finally, computational analyses are performed to demonstrate the applicability of the model. Numerical results justify the two hypotheses.
Advisors/Committee Members: Walid Abdul-Kader.
Subjects/Keywords: Markov Decision Process; Outsourcing; Reverse Supply Chain; RL Network
APA (6th Edition):
Islam, K. J. (2012). Outsourcing Evaluation in RL Network. (Masters Thesis). University of Windsor. Retrieved from https://scholar.uwindsor.ca/etd/5347
Chicago Manual of Style (16th Edition):
Islam, Kingshuk Jubaer. “Outsourcing Evaluation in RL Network.” 2012. Masters Thesis, University of Windsor. Accessed April 11, 2021.
https://scholar.uwindsor.ca/etd/5347.
MLA Handbook (7th Edition):
Islam, Kingshuk Jubaer. “Outsourcing Evaluation in RL Network.” 2012. Web. 11 Apr 2021.
Vancouver:
Islam KJ. Outsourcing Evaluation in RL Network. [Internet] [Masters thesis]. University of Windsor; 2012. [cited 2021 Apr 11].
Available from: https://scholar.uwindsor.ca/etd/5347.
Council of Science Editors:
Islam KJ. Outsourcing Evaluation in RL Network. [Masters Thesis]. University of Windsor; 2012. Available from: https://scholar.uwindsor.ca/etd/5347

University of Edinburgh
24.
Ahmad Mustaffa, Nurakmal.
Global dual-sourcing strategy : is it effective in mitigating supply disruption?.
Degree: PhD, 2015, University of Edinburgh
URL: http://hdl.handle.net/1842/21046
► Most firms are still failing to think strategically and systematically about managing supply disruption risk and most of the supply chain management efforts are focused…
(more)
▼ Most firms are still failing to think strategically and systematically about managing supply disruption risk, and most supply chain management efforts are focused on reducing supply chain operation costs rather than managing disruption. Some innovative firms have taken steps to implement supply chain risk management (SCRM). Inventory management is part of SCRM because supply disruptions negatively affect the reliability of deliveries from suppliers and the costs associated with the ordering process. The complexity of existing inventory models makes it challenging to combine the management of the supply process and inventory in a single model due, for example, to the difficulty of including the characteristics of the disruption process in the supply chain network structure. Therefore, there is a need for a simple flexible model that can incorporate the key elements of supply disruption in an inventory model. This thesis presents a series of models that investigate the importance of information on disruption discovery and recovery for a firm’s supply and inventory management. A simple two-echelon supply chain with one firm and two suppliers (referred to as the onshore and offshore suppliers, respectively) in a single product/component setting has been considered in this thesis for the purpose of experimental analyses. The sourcing decisions that the firm faces during periods of supply disruption are examined, leading to an assessment of how information about the risk and length of disruption and recovery can be used to facilitate the firm’s sourcing decisions and monitor the performance of stock control during the disruption. The first part of this thesis analyses basic ordering models without and with the risk of supply disruption (Models 1 and 2, respectively). The second part analyses the value of supply disruption information, using a model with advance information on the length of disruption (Model 3) and a model with learning about the length of disruption (Model 4). The third part explores a quantitative recovery model; the analyses in this part consider three models. Model 5 assumes a basic phased recovery model, Model 6 assumes advance information about the phased recovery process, and Model 7 assumes learning about the phased recovery process. The last part of this thesis investigates the order pressure scenario that exists in the firm’s supply chain. Under this scenario, disruption to one part of the supply chain network increases demand on the remainder, resulting in lower service levels than normal. This scenario is applied to all the previous models apart from Model 1. The models in this thesis are examined under finite and infinite planning horizons and with constant and stochastic demand. The objective of the models is to minimise the expected inventory cost and optimise the order quantity from the suppliers given the different assumptions with respect to the length of supply disruption and information about the recovery process. The models have been developed using…
Subjects/Keywords: 658.7; supply disruption; discrete time Markov decision process; DTMDP
APA (6th Edition):
Ahmad Mustaffa, N. (2015). Global dual-sourcing strategy : is it effective in mitigating supply disruption?. (Doctoral Dissertation). University of Edinburgh. Retrieved from http://hdl.handle.net/1842/21046
Chicago Manual of Style (16th Edition):
Ahmad Mustaffa, Nurakmal. “Global dual-sourcing strategy : is it effective in mitigating supply disruption?.” 2015. Doctoral Dissertation, University of Edinburgh. Accessed April 11, 2021.
http://hdl.handle.net/1842/21046.
MLA Handbook (7th Edition):
Ahmad Mustaffa, Nurakmal. “Global dual-sourcing strategy : is it effective in mitigating supply disruption?.” 2015. Web. 11 Apr 2021.
Vancouver:
Ahmad Mustaffa N. Global dual-sourcing strategy : is it effective in mitigating supply disruption?. [Internet] [Doctoral dissertation]. University of Edinburgh; 2015. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/1842/21046.
Council of Science Editors:
Ahmad Mustaffa N. Global dual-sourcing strategy : is it effective in mitigating supply disruption?. [Doctoral Dissertation]. University of Edinburgh; 2015. Available from: http://hdl.handle.net/1842/21046

Queens University
25.
Cownden, Daniel.
Evolutionarily Stable Learning and Foraging Strategies.
Degree: Mathematics and Statistics, 2012, Queens University
URL: http://hdl.handle.net/1974/6999
► This thesis examines a series of problems with the goal of better understanding the fundamental dilemma of whether to invest effort in obtaining information that…
(more)
▼ This thesis examines a series of problems with the goal of better understanding the fundamental dilemma of whether to invest effort in obtaining information that may lead to better opportunities in the future, versus exploiting immediately available opportunities. In particular, this work investigates how this dilemma is affected by competition in an evolutionary setting. Achieving this requires the use of both evolutionary game theory and Markov decision processes, or stochastic dynamic programming. This thesis grows directly out of earlier work on the Social Learning Strategies Tournament. Although I cast the problem in the biological setting of optimal foraging theory, where it fills an obvious gap, this fundamental dilemma should also be of some interest to economists and operations researchers, as well as those working in ecology, evolution and behaviour.
Subjects/Keywords: Evolutionary Game Theory; Partially Observable Markov Decision Process
APA (6th Edition):
Cownden, D. (2012). Evolutionarily Stable Learning and Foraging Strategies. (Thesis). Queens University. Retrieved from http://hdl.handle.net/1974/6999
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Cownden, Daniel. “Evolutionarily Stable Learning and Foraging Strategies.” 2012. Thesis, Queens University. Accessed April 11, 2021.
http://hdl.handle.net/1974/6999.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Cownden, Daniel. “Evolutionarily Stable Learning and Foraging Strategies.” 2012. Web. 11 Apr 2021.
Vancouver:
Cownden D. Evolutionarily Stable Learning and Foraging Strategies. [Internet] [Thesis]. Queens University; 2012. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/1974/6999.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Cownden D. Evolutionarily Stable Learning and Foraging Strategies. [Thesis]. Queens University; 2012. Available from: http://hdl.handle.net/1974/6999
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Louisiana State University
26.
Irshad, Ahmed Syed.
Fuzzifying [sic] Markov decision process.
Degree: MSEE, Electrical and Computer Engineering, 2005, Louisiana State University
URL: etd-04112005-224801
;
https://digitalcommons.lsu.edu/gradschool_theses/1373
► Markov decision processes have become an indispensable tool in applications as diverse as equipment maintenance, manufacturing systems, inventory control, queuing networks and investment analysis. Typically…
(more)
▼ Markov decision processes have become an indispensable tool in applications as diverse as equipment maintenance, manufacturing systems, inventory control, queueing networks and investment analysis. Typically we have a controlled Markov chain on a suitable state space in which the transition probabilities depend on the action chosen by the policy (or decision maker) from a set of possible actions. The main problem of interest is to find an optimal policy that minimizes the associated cost. Linear programming has been widely used to find the optimal Markov decision policy; it requires solutions of large systems of simultaneous linear equations. Because the complexity of linear programming increases rapidly with the number of states, a phenomenon often called the curse of dimensionality, the linear programming method can handle only small models. This thesis presents a new method to lessen the curse of dimensionality. By assuming a certain monotonicity property for the transition probabilities, it is shown that fuzzy membership functions can be used to reduce the number of states. Although fewer states are retained, all the original states remain accessible through their membership values: the eliminated states can be recovered through interpolation with the aid of the membership functions. This new method is shown to be effective in coping with the curse of dimensionality.
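The interpolation idea can be shown in a few lines: keep a coarse set of anchor states, give every original state triangular fuzzy memberships in its two surrounding anchors, and recover the value of an eliminated state by membership-weighted interpolation. The anchors and values below are invented for the example.

ANCHORS = [0, 5, 10, 15, 20]             # retained (anchor) states
V_ANCHOR = {0: 0.0, 5: 2.5, 10: 6.0, 15: 11.0, 20: 18.0}

def memberships(s):
    """Triangular membership of state s in its two surrounding anchors."""
    for lo, hi in zip(ANCHORS, ANCHORS[1:]):
        if lo <= s <= hi:
            mu_hi = (s - lo) / (hi - lo)
            return {lo: 1.0 - mu_hi, hi: mu_hi}
    raise ValueError("state outside anchor range")

def value(s):
    """Recover the value of any state, retained or eliminated."""
    return sum(mu * V_ANCHOR[a] for a, mu in memberships(s).items())

# An eliminated state such as s = 7 is recovered by interpolation:
# memberships(7) == {5: 0.6, 10: 0.4}, so value(7) == 0.6*2.5 + 0.4*6.0 == 3.9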
Subjects/Keywords: markov decision process; fuzzy membership
APA (6th Edition):
Irshad, A. S. (2005). Fuzzifying [sic] Markov decision process. (Masters Thesis). Louisiana State University. Retrieved from etd-04112005-224801 ; https://digitalcommons.lsu.edu/gradschool_theses/1373
Chicago Manual of Style (16th Edition):
Irshad, Ahmed Syed. “Fuzzifying [sic] Markov decision process.” 2005. Masters Thesis, Louisiana State University. Accessed April 11, 2021.
etd-04112005-224801 ; https://digitalcommons.lsu.edu/gradschool_theses/1373.
MLA Handbook (7th Edition):
Irshad, Ahmed Syed. “Fuzzifying [sic] Markov decision process.” 2005. Web. 11 Apr 2021.
Vancouver:
Irshad AS. Fuzzifying [sic] Markov decision process. [Internet] [Masters thesis]. Louisiana State University; 2005. [cited 2021 Apr 11].
Available from: etd-04112005-224801 ; https://digitalcommons.lsu.edu/gradschool_theses/1373.
Council of Science Editors:
Irshad AS. Fuzzifying [sic] Markov decision process. [Masters Thesis]. Louisiana State University; 2005. Available from: etd-04112005-224801 ; https://digitalcommons.lsu.edu/gradschool_theses/1373

University of Georgia
27.
Bogert, Kenneth Daniel.
Inverse reinforcement learning for robotic applications.
Degree: 2017, University of Georgia
URL: http://hdl.handle.net/10724/36625
► Robots deployed into many real-world scenarios are expected to face situations that their designers could not anticipate. Machine learning is an effective tool for extending…
(more)
▼ Robots deployed into many real-world scenarios are expected to face situations that their designers could not anticipate. Machine learning is an effective tool for extending the capabilities of these robots by allowing them to adapt their behavior to the situation in which they find themselves. Most machine learning techniques are applicable to learning either static elements in an environment or elements with simple dynamics. We wish to address the problem of learning the behavior of other intelligent agents that the robot may encounter. To this end, we extend a well-known Inverse Reinforcement Learning (IRL) algorithm, Maximum Entropy IRL, to address challenges expected to be encountered by autonomous robots during learning. These include: occlusion of the observed agent’s state space due to limits of the learner’s sensors or objects in the environment, the presence of multiple agents who interact, and partial knowledge of other agents’ dynamics. Our contributions are investigated with experiments using simulated and real-world robots. These experiments include learning a fruit sorting task from human demonstrations and autonomously penetrating a perimeter patrol. Our work takes several important steps towards deploying IRL alongside other machine learning methods for use by autonomous robots.
Subjects/Keywords: robotics; inverse reinforcement learning; machine learning; Markov decision process
APA (6th Edition):
Bogert, K. D. (2017). Inverse reinforcement learning for robotic applications. (Thesis). University of Georgia. Retrieved from http://hdl.handle.net/10724/36625
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Bogert, Kenneth Daniel. “Inverse reinforcement learning for robotic applications.” 2017. Thesis, University of Georgia. Accessed April 11, 2021.
http://hdl.handle.net/10724/36625.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Bogert, Kenneth Daniel. “Inverse reinforcement learning for robotic applications.” 2017. Web. 11 Apr 2021.
Vancouver:
Bogert KD. Inverse reinforcement learning for robotic applications. [Internet] [Thesis]. University of Georgia; 2017. [cited 2021 Apr 11].
Available from: http://hdl.handle.net/10724/36625.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Bogert KD. Inverse reinforcement learning for robotic applications. [Thesis]. University of Georgia; 2017. Available from: http://hdl.handle.net/10724/36625
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of New South Wales
28.
Bokani, Ayub.
Dynamic adaptation of HTTP-based video streaming using Markov decision process.
Degree: Computer Science & Engineering, 2015, University of New South Wales
URL: http://handle.unsw.edu.au/1959.4/55827
;
https://unsworks.unsw.edu.au/fapi/datastream/unsworks:39485/SOURCE02?view=true
► Hypertext transfer protocol (HTTP) is the fundamental mechanics supporting web browsing on the Internet. An HTTP server stores large volumes of contents and delivers specific…
(more)
▼ Hypertext transfer protocol (HTTP) is the fundamental mechanism supporting web browsing on the Internet. An HTTP server stores large volumes of content and delivers specific pieces to clients when requested. There is a recent move to use HTTP for video streaming as well, which promises seamless integration of video delivery into existing HTTP-based server platforms. This is achieved by segmenting the video into many small chunks and storing these chunks as separate files on the server. For adaptive streaming, the server stores different quality versions of the same chunk in different files, to allow real-time quality adaptation of the video as the network bandwidth experienced by a client varies. For each chunk of the video, which quality version to download therefore becomes a major decision-making challenge for the streaming client, especially in vehicular environments with significant uncertainty in mobile bandwidth. The key objective of this thesis is to explore more advanced decision-making tools that would enable an improved tradeoff between conflicting QoE metrics in vehicular environments. In particular, this thesis studies the effectiveness of the Markov decision process (MDP), which is known for its ability to optimize decision making under uncertainty. The thesis makes three fundamental contributions: (1) using real video and network bandwidth datasets, it shows that MDP can reduce playback deadline misses of the video (video freezing) by up to 15 times compared to a well-known non-MDP strategy when the bandwidth model is known a priori; (2) it proposes a Q-learning implementation of MDP that does not need any a priori knowledge of the bandwidth, but learns optimal decision making in a self-learning manner by simply observing the outcomes of its decisions, and it is demonstrated that, in terms of deadline misses, the Q-learning-based MDP outperforms the model-based MDP by a factor of three; and (3) it implements the proposed decision-making framework on the Android platform and demonstrates the effectiveness of the proposed MDP-based video adaptation through real experiments.
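For the model-based case (bandwidth model known a priori), a minimal finite-horizon dynamic programming sketch over (buffer level, bandwidth state) pairs is shown below. Chunk sizes, the bandwidth chain and the deadline-miss penalty are illustrative assumptions, not the thesis's datasets or parameters.

CHUNK_SEC = 2.0                               # playback seconds per chunk
QUALITY_BITS = {0: 1.0, 1: 2.5, 2: 5.0}       # Mbit per chunk at each quality
BW_STATES = {0: 1.0, 1: 3.0}                  # Mbit/s in each bandwidth state
BW_TRANS = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
MAX_BUF, HORIZON, MISS_PENALTY = 10, 30, 50.0

V = {(b, w): 0.0 for b in range(MAX_BUF + 1) for w in BW_STATES}
policy = {}
for t in range(HORIZON):                      # backward induction
    V_new = {}
    for (b, w) in V:
        best, best_a = -float("inf"), 0
        for a, bits in QUALITY_BITS.items():
            dl = bits / BW_STATES[w]          # download time of the chunk (s)
            drained = dl / CHUNK_SEC          # chunks played back meanwhile
            b2 = min(MAX_BUF, max(0.0, b - drained) + 1)
            # reward: quality earned, minus a penalty if the buffer ran dry
            r = a - (MISS_PENALTY if b - drained < 0 else 0.0)
            ev = sum(p * V[(round(b2), w2)] for w2, p in BW_TRANS[w].items())
            if r + ev > best:
                best, best_a = r + ev, a
        V_new[(b, w)] = best
        policy[(b, w)] = best_a
    V = V_new
# policy maps (buffer level, bandwidth state) to the quality to request next.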
Advisors/Committee Members: Hassan, Mahbub, Computer Science & Engineering, Faculty of Engineering, UNSW.
Subjects/Keywords: Dynamic Adaptive Streaming over HTTP; Video Streaming; Markov Decision Process; DASH
APA (6th Edition):
Bokani, A. (2015). Dynamic adaptation of HTTP-based video streaming using Markov decision process. (Doctoral Dissertation). University of New South Wales. Retrieved from http://handle.unsw.edu.au/1959.4/55827 ; https://unsworks.unsw.edu.au/fapi/datastream/unsworks:39485/SOURCE02?view=true
Chicago Manual of Style (16th Edition):
Bokani, Ayub. “Dynamic adaptation of HTTP-based video streaming using Markov decision process.” 2015. Doctoral Dissertation, University of New South Wales. Accessed April 11, 2021.
http://handle.unsw.edu.au/1959.4/55827 ; https://unsworks.unsw.edu.au/fapi/datastream/unsworks:39485/SOURCE02?view=true.
MLA Handbook (7th Edition):
Bokani, Ayub. “Dynamic adaptation of HTTP-based video streaming using Markov decision process.” 2015. Web. 11 Apr 2021.
Vancouver:
Bokani A. Dynamic adaptation of HTTP-based video streaming using Markov decision process. [Internet] [Doctoral dissertation]. University of New South Wales; 2015. [cited 2021 Apr 11].
Available from: http://handle.unsw.edu.au/1959.4/55827 ; https://unsworks.unsw.edu.au/fapi/datastream/unsworks:39485/SOURCE02?view=true.
Council of Science Editors:
Bokani A. Dynamic adaptation of HTTP-based video streaming using Markov decision process. [Doctoral Dissertation]. University of New South Wales; 2015. Available from: http://handle.unsw.edu.au/1959.4/55827 ; https://unsworks.unsw.edu.au/fapi/datastream/unsworks:39485/SOURCE02?view=true

The Ohio State University
29.
Swang, Theodore W, II.
A Mathematical Model for the Energy Allocation Function of Sleep.
Degree: PhD, Mathematics, 2017, The Ohio State University
URL: http://rave.ohiolink.edu/etdc/view?acc_num=osu1483392711778623
► The function of sleep remains one of the greatest unsolved questions in biology. Schmidt has proposed the unifying Energy Allocation Function of sleep, which posits…
(more)
▼ The function of sleep remains one of the greatest unsolved questions in biology. Schmidt has proposed the unifying Energy Allocation Function of sleep, which posits that the ultimate function of sleep is effective energy allocation in the service of state-dependent division of labor, or repartitioning of metabolic operations. We present a mathematical model based on Schmidt's Energy Allocation model. The fundamental quantity we model is called biological debt (BD). We define biological requirements (BR) as the summation of maintenance obligations generated by all metabolic operations, biological investment (BI) as the summation of completed functions servicing these requirements, and BD as the difference (BD = BR - BI). We model BD as a discontinuous non-autonomous ordinary differential equation. We analyze bifurcations as well as existence and nonexistence of limit cycles. In order to apply the theory of averaging, we construct a smooth approximation to the equation for BD, and show this approximation undergoes a saddle-node bifurcation of limit cycles. We compare and contrast our model with Borbély's two-process model of sleep and with empirical data of human neurobehavioural performance. We define a division of labor parameter (DOL) and use BD to develop an algorithm to compute the energy saved by sleep-wake cycling compared to continuous wakefulness. We quantify the contributions to energy savings from DOL and from metabolic rate reduction during sleep. We numerically compute energy savings with this method, finding substantially greater savings than previous estimates. Some implications of the energy savings model include predictions that biological debt may govern sleep homeostasis; that short sleepers may have increased metabolic rates in sleep compared to long sleepers, for whom energy savings may be primarily derived from metabolic rate reduction; and that dampening circadian amplitude during periods of sleep deprivation may be an adaptive feature. We present an alternative energy savings calculation based on averaging theory and compare it to our original energy savings computation. Finally, we develop a Markov Decision Process with a reward of net energy intake in order to find an optimal sleep-wake policy under a variety of conditions. We use this Markov Decision Process to optimize a policy under three different sets of conditions.
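A toy version of the final chapter's construction: a small MDP whose state tracks biological debt and hour of day, whose actions are sleep or wake, and whose reward is net energy intake; value iteration then yields a sleep-wake policy. All numbers are invented for illustration.

DEBT_LEVELS, HOURS = range(5), range(24)
GAMMA = 0.98

def reward(debt, hour, action):
    if action == "wake":
        intake = 2.0 if 8 <= hour < 20 else 0.5   # foraging is diurnal
        return intake - 1.0 - 0.2 * debt          # expenditure grows with debt
    return -0.6                                    # reduced metabolic rate asleep

def step(debt, hour, action):
    debt2 = min(4, debt + 1) if action == "wake" else max(0, debt - 1)
    return debt2, (hour + 1) % 24

V = {(d, h): 0.0 for d in DEBT_LEVELS for h in HOURS}
for _ in range(500):                               # value iteration
    V = {(d, h): max(reward(d, h, a) + GAMMA * V[step(d, h, a)]
                     for a in ("wake", "sleep"))
         for (d, h) in V}
policy = {(d, h): max(("wake", "sleep"),
                      key=lambda a: reward(d, h, a) + GAMMA * V[step(d, h, a)])
          for (d, h) in V}
# The resulting policy consolidates sleep into the low-intake hours.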
Advisors/Committee Members: Best, Janet (Advisor).
Subjects/Keywords: Mathematics; Sleep; mathematical biology; differential equations; Markov decision process
APA (6th Edition):
Swang, Theodore W, II. (2017). A Mathematical Model for the Energy Allocation Function of Sleep. (Doctoral Dissertation). The Ohio State University. Retrieved from http://rave.ohiolink.edu/etdc/view?acc_num=osu1483392711778623
Chicago Manual of Style (16th Edition):
Swang, Theodore W, II. “A Mathematical Model for the Energy Allocation Function of Sleep.” 2017. Doctoral Dissertation, The Ohio State University. Accessed April 11, 2021.
http://rave.ohiolink.edu/etdc/view?acc_num=osu1483392711778623.
MLA Handbook (7th Edition):
Swang, Theodore W, II. “A Mathematical Model for the Energy Allocation Function of Sleep.” 2017. Web. 11 Apr 2021.
Vancouver:
Swang, Theodore W II. A Mathematical Model for the Energy Allocation Function of Sleep. [Internet] [Doctoral dissertation]. The Ohio State University; 2017. [cited 2021 Apr 11].
Available from: http://rave.ohiolink.edu/etdc/view?acc_num=osu1483392711778623.
Council of Science Editors:
Swang, Theodore W II. A Mathematical Model for the Energy Allocation Function of Sleep. [Doctoral Dissertation]. The Ohio State University; 2017. Available from: http://rave.ohiolink.edu/etdc/view?acc_num=osu1483392711778623

Texas State University – San Marcos
30.
Kelly, Janiece.
The Effect of Decoy Attacks on Dynamic Channel Assignment.
Degree: MS, Computer Science, 2014, Texas State University – San Marcos
URL: https://digital.library.txstate.edu/handle/10877/6370
► As networks grow rapidly denser with the introduction of wireless-enabled cars, wearables and appliances, signal interference coupled with limited radio spectrum availability emerges as a…
(more)
▼ As networks grow rapidly denser with the introduction of wireless-enabled cars, wearables and appliances, signal interference coupled with limited radio spectrum availability emerges as a significant hindrance to network performance. In order to retain high network throughput, channels must be strategically assigned to nodes in a way that minimizes signal overlap between neighboring nodes. Current static techniques for channel assignment are intolerant of network variations and growth, but flexible dynamic assignment techniques are becoming more feasible with the introduction of software-defined networks and network function virtualization. Virtualized networks abstract hardware functions to software, making tasks such as channel assignment much more reactive and suitable for automation. As network maintenance tasks are increasingly handled by software, however, network stability becomes susceptible to malicious behavior. In this thesis, we expose and study the effect of stealthy attacks that aim to trigger unnecessary channel switching in a network and increase signal interference. We develop a Markov Decision Problem (MDP) framework and investigate suboptimal attack policies applied to a number of real-world topologies. We derive attack policies as an approximate MDP solution due to the exponentially large state space. Determining vulnerabilities to stealthy attacks is necessary in order to improve the security and stability of software-defined networks.
Advisors/Committee Members: Guirguis, Mina (advisor), Gu, Qijun (committee member), Ferrero, Daniela (committee member).
Subjects/Keywords: Computer science; Security; Dynamic channel assignment; Markov decision process
APA (6th Edition):
Kelly, J. (2014). The Effect of Decoy Attacks on Dynamic Channel Assignment. (Masters Thesis). Texas State University – San Marcos. Retrieved from https://digital.library.txstate.edu/handle/10877/6370
Chicago Manual of Style (16th Edition):
Kelly, Janiece. “The Effect of Decoy Attacks on Dynamic Channel Assignment.” 2014. Masters Thesis, Texas State University – San Marcos. Accessed April 11, 2021.
https://digital.library.txstate.edu/handle/10877/6370.
MLA Handbook (7th Edition):
Kelly, Janiece. “The Effect of Decoy Attacks on Dynamic Channel Assignment.” 2014. Web. 11 Apr 2021.
Vancouver:
Kelly J. The Effect of Decoy Attacks on Dynamic Channel Assignment. [Internet] [Masters thesis]. Texas State University – San Marcos; 2014. [cited 2021 Apr 11].
Available from: https://digital.library.txstate.edu/handle/10877/6370.
Council of Science Editors:
Kelly J. The Effect of Decoy Attacks on Dynamic Channel Assignment. [Masters Thesis]. Texas State University – San Marcos; 2014. Available from: https://digital.library.txstate.edu/handle/10877/6370
◁ [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] ▶