You searched for subject:(Sequential Decision Making).
Showing records 1 – 30 of 62 total matches.
1.
Sodomka, Eric M.
Prediction and Optimization Abstractions in Market Games.
Degree: PhD, Computer Science, 2015, Brown University
URL: https://repository.library.brown.edu/studio/item/bdr:419462/
The purpose of this thesis is to better understand how to make good decisions in markets. Decision-making in markets is notoriously hard: a buyer or seller must choose from many possible actions in a stochastic, partially-observable, multi-agent environment, and must make these choices repeatedly over time. To make decisions in practice, buyers and sellers often simplify the problem with abstractions. A common abstraction, central to this thesis, is one in which agents make various predictions about the current and future states of the world, and then optimize with respect to those predictions. But this begs the question: what prediction and optimization abstractions lead to effective decisions? In this thesis, I present case studies in which I designed, implemented, and analyzed performance of autonomous agents in seven particular market domains. These domains include trading agent competitions, simpler auction domains, and real-world online advertising domains. For each market domain, I formalize a game-theoretic (or in some special cases, decision-theoretic) model for making decisions in that domain. I characterize the resulting problem structure that can potentially be exploited by abstractions. I develop autonomous agents specialized for making decisions in each domain, and describe these agents in terms of their prediction and optimization abstractions (and algorithms for solving those abstractions). I demonstrate the effectiveness of these abstractions through empirical game-theoretic analyses, evaluation of TAC tournament performance, and worst-case bounds on regret for making different problem abstractions. As a step towards understanding what problem structure exists and what abstractions are effective across market domains, I developed a working taxonomy of prediction and optimization abstractions that were frequently effective in these particular market domains.
Advisors/Committee Members: Greenwald, Amy (Director), Littman, Michael (Reader), Wellman, Michael (Reader).
Subjects/Keywords: sequential decision-making

University of Adelaide
2.
Gokaydin, Dinis Sequeira Do Couto.
The structure of sequential effects.
Degree: 2015, University of Adelaide
URL: http://hdl.handle.net/2440/98135
Research into sequential effects has a long and rich history spanning almost one hundred years. In their most general definition sequential effects can simply be considered a dependence of behaviour on the past sequence of events, and are one of the most pervasive phenomena in psychology. Some form of sequential effects has been observed in multiple perceptual and cognitive tasks, and across different modalities. In addition, sequential effects have also been observed in electrophysiological studies, with a great deal of similarity observed between EEG and behavioural results, making this a relevant topic for both psychology and neuroscience. This gives sequential effects a great deal of potential as a doorway for elucidating the relationship between human behaviour and neuronal activity, between the mind and the brain. Yet perhaps in part because of the great diversity of domains in which sequential effects are observed, this is an often fragmented field of research, with a multitude of experimental paradigms used, often leading to some confusion as to how different results are related to each other. One of the main objectives of this work is therefore to begin to unify the field into one coherent whole, and to do so at both the computational and process levels. To begin with, Chapter 2 addresses the computational nature of sequential effects in terms of different types of statistics humans use in different circumstances. In Chapter 3 it is shown that most results described before in the literature can be explained by only three components, including a wealth of individual differences which had been largely ignored so far. On a more theoretical level it could be argued that there is a degree of redundancy between the various mathematical models of sequential effects proposed over the years. Models are usually fit to isolated datasets, when it is well known that even minor experimental manipulations can lead to different results, making it unclear how conclusions extend to other settings. Moreover, by virtue of their common mathematical structure, most models of sequential effects suffer from similar difficulties in reproducing key empirical observations. This, together with other considerations, motivates an entirely different approach to modelling sequential effects proposed in Chapter 4. The framework suggested is based on the physics of oscillatory motion, being continuous-time in nature and able to incorporate space, reflecting the fact that both time and space have been found empirically to play a role in sequential effects. More generally there are two central proposals which unify this dissertation. Firstly that sequential effects are the consequence of two main independent components possibly related to the separate processing of stimuli and responses. Secondly that sequential effects reflect some form of filtering implemented through interaction with an oscillatory system.
Advisors/Committee Members: Navarro, Daniel Joseph (advisor), Ma-Wyatt, Anna (advisor), Perfors, Amy Francesca (advisor), School of Psychology (school).
Subjects/Keywords: sequential effects; reaction time; decision making

University of Alberta
3.
Akkoc, Ali Utku.
How Choosing for Others Affects Consumption for the Self: The Consequences of Preference Imposition and Accommodation.
Degree: PhD, Faculty of Business, 2015, University of Alberta
URL: https://era.library.ualberta.ca/files/cv43p0567
Consumers make choices not only for themselves but for others. For example, parents make a variety of consumption decisions in a typical day for themselves as well as for their family. Yet, little is known about how decisions made for others influence the decision maker’s subsequent consumption. Identifying two approaches—imposition and accommodation—that are available to decision makers, this dissertation provides insights into how choosing for another person affects the healthiness of one’s own subsequent consumption preferences and tests a power-based framework which explains the psychological process underlying this phenomenon. Looking at consumption choices adults make for children, four experiments and one field study demonstrate that imposing a consumption choice on others makes individuals feel more powerful relative to accommodating the target’s preferences, and subsequently leads decision makers to make more indulgent choices for themselves. Moreover, findings show that the social context of consumption moderates the effects of imposition and accommodation on the decision maker’s own choices. Finally, the results rule out licensing and guilt as alternative explanations. Implications for theory and practice are discussed.
Subjects/Keywords: Self-other Decision Making; Sequential Decision Making; Imposition and Accommodation; Power and Indulgence

University of Newcastle
4.
Tillman, Gabriel.
Advancing methods and mathematical models of perceptual decision making.
Degree: PhD, 2017, University of Newcastle
URL: http://hdl.handle.net/1959.13/1335615
Research Doctorate - Doctor of Philosophy (PhD)
In this thesis I argue that cognitive psychologists can use the combination of sequential sampling models, Bayesian estimation methods, and model comparison via predictive accuracy to investigate underlying cognitive processes of perceptual decision-making. I show that sequential sampling models of simple and choice response time allow researchers to analyze behavioral data and translate them into the constituent components of processing, such as speed of processing, response caution, and the time needed for perceptual encoding and overt motor responses. I use these methods and models to investigate underlying mental processes related to cognitive load, speech perception, and lexical decision-making. I also show that using different sequential sampling models to analyze the same data can lead researchers to draw different conclusions about cognitive processes, which serves as a caution against carelessly using these models. I also present a novel method that researchers can use to observe cognitive processes unfold online during perceptual decision-making tasks. I then discuss a promising collaboration emerging between researchers in the field of mathematical modeling and neuroscience.
Advisors/Committee Members: University of Newcastle. Faculty of Science, School of Psychology.
Subjects/Keywords: response time; perceptual decision making; sequential sampling models; thesis by publication

University of Sydney
5.
Zhang, Huihui.
Temporal dependence in perceptual decision-making: behavioural oscillations and sequential effects.
Degree: 2019, University of Sydney
URL: http://hdl.handle.net/2123/20847
Dynamic brain states influence perceptual decision making, especially when the immediate sensory evidence is noisy. In this thesis, I examine how perceptual decision making is shaped by two types of temporal contexts, one characterised by intrinsic temporal organisation of neural processing (i.e., neural oscillations) and one characterised by the recent history of perceptual, decisional, and motor experience (i.e., sequential effects). Chapter 2 examines rhythmic fluctuations of behavioural performance in visual orientation discrimination. The results indicate that sensitivity and response bias both modulate rhythmically over time: ~8 Hz for sensitivity, and ~10 Hz for response bias. Chapter 3 examines sequential effects in visual orientation discrimination. Analysing the data-set from chapter 2 shows that it’s the response, rather than the stimulus, that carries over to the next trial. Moreover, the one-trial back sequential effect shows individual differences (positive or negative). In a new experiment with a trial-by-trial random stimulus-response mapping, the choice effect was consistently positive and the motor effect was consistently repulsive, suggesting that the individual differences may be caused by different relative weightings of the perceptual decision and the motor response. Chapter 4 further examines sequential effects with auditory stimuli of different morph levels in two dimensions: gender, syllable (ba/da). Observers reported male or female and ba or da at the same time. For gender, decisions were biased towards the previous choice, and this bias was modulated by the similarity between previous and current stimuli – being stronger when the stimuli were similar. For syllable, the same bias towards previous choice was found, but the dependence was not modulated by similarity between temporally adjacent stimuli. To summarise, these findings reveal that both intrinsic neural oscillations and past history shape the current perceptual decision making.
Subjects/Keywords: serial dependence; sequential effects; behavioural oscillation; perceptual decision making; bias
6.
Liebman, Elad.
Sequential decision making in artificial musical intelligence.
Degree: PhD, Computer Science, 2019, University of Texas – Austin
URL: http://dx.doi.org/10.26153/tsw/2858
Over the past 60 years, artificial intelligence has grown from a largely academic field of research to a ubiquitous array of tools and approaches used in everyday technology. Despite its many recent successes and growing prevalence, certain meaningful facets of computational intelligence have not been as thoroughly explored. Such additional facets cover a wide array of complex mental tasks which humans carry out easily, yet are difficult for computers to mimic. A prime example of a domain in which human intelligence thrives, but machine understanding is still fairly limited, is music. Over the last decade, many researchers have applied computational tools to carry out tasks such as genre identification, music summarization, music database querying, and melodic segmentation. While these are all useful algorithmic solutions, we are still a long way from constructing complete music agents, able to mimic (at least partially) the complexity with which humans approach music. One key aspect which hasn't been sufficiently studied is that of sequential decision making in musical intelligence. This thesis strives to answer the following question: Can a sequential decision making perspective guide us in the creation of better music agents, and social agents in general? And if so, how? More specifically, this thesis focuses on two aspects of musical intelligence: music recommendation and human-agent (and more generally agent-agent) interaction in the context of music. The key contributions of this thesis are the design of better music playlist recommendation algorithms; the design of algorithms for tracking user preferences over time; new approaches for modeling people's behavior in situations that involve music; and the design of agents capable of meaningful interaction with humans and other agents in a setting where music plays a role (either directly or indirectly). Though motivated primarily by music-related tasks, and focusing largely on people's musical preferences, this thesis also establishes that insights from music-specific case studies can also be applicable in other concrete social domains, such as different types of content recommendation. Showing the generality of insights from musical data in other contexts serves as evidence for the utility of music domains as testbeds for the development of general artificial intelligence techniques. Ultimately, this thesis demonstrates the overall usefulness of taking a sequential decision making approach in settings previously unexplored from this perspective.
Advisors/Committee Members: Stone, Peter, 1971- (advisor), Dannenberg, Roger (committee member), Grauman, Kristen (committee member), Niekum, Scott (committee member), Saar-Tsechansky, Maytal (committee member).
Subjects/Keywords: Music informatics; Reinforcement learning; Artificial intelligence; Sequential decision-making

Penn State University
7.
Miller, Simon W.
Design as a Sequential Decision Process: A Method for Reducing Set Space Using Models to Bound Objectives.
Degree: 2017, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/13724swm154
Complex engineered systems rely heavily on simulation models for design. Designing such systems typically involves a sequential decision process that increases the modeling and engineering analysis detail while simultaneously decreasing the space of alternatives considered. This space of alternatives is called the tradespace, and in a decision theoretic framework, low-fidelity models help decision-makers identify regions of interest in the tradespace and cull others prior to constructing more computationally-expensive higher fidelity models to further discriminate alternatives in support of a model-based systems engineering process.
Previous work has shown that multi-fidelity modeling can aid in rapid optimization of the design space when higher fidelity models are coupled with lower fidelity models, i.e., it is assumed that both models are available, and the two are used as a computational "trick" to speed up optimization routines. This dissertation introduces and motivates the sequential process of design through a formal model. The formal model seeks to answer (1) what structure models of varying fidelities should have relative to one another, (2) a linkage between decisions made from each modeling effort, and (3) a method for selecting a sequence of computational models and analyses when considering design as a sequential decision process. The method presented herein demonstrates design as a sequence of finite decision epochs through a tradespace defined by the extent of the set of designs under consideration, modeling characteristics, and the level of analytic fidelity subjected to each design. This work introduces a method for constructing cost-optimal sequences of models of increasing fidelity by reducing the design set size under the guarantee that the optimal solution remains in the consideration set at each epoch.
A summary of relevant literature in design sciences, social sciences, and engineering optimization communities is given to scope the research. The formal model is presented with its assumptions, definitions, and applications as well as a method for applying the model to an engineering system. The method is applied to several examples including 1D and 2D finite element analysis and a bi-level optimization problem with a combinatorial kernel based on a 2D bin packing problem. It is demonstrated that by treating design as a sequential decision process, significant savings (e.g., time, money) can be obtained.
Advisors/Committee Members: Timothy W Simpson, Dissertation Advisor/Co-Advisor, Timothy W Simpson, Committee Chair/Co-Chair, Mary Frecker, Committee Member, Michael A Yukish, Committee Member, Gordon Warn, Outside Member, Mark Traband, Committee Member.
Subjects/Keywords: design; decision-making; sequential decision; bounding model; interval dominance; discriminatory; model; set-based design; fidelity

University of Toronto
8.
Perrault, Andrew.
Developing and Coordinating Autonomous Agents for Efficient Electricity Markets.
Degree: PhD, 2018, University of Toronto
URL: http://hdl.handle.net/1807/92135
Whether for environmental, conservation, efficiency, or economic reasons, developing next generation electric power infrastructure is critical. Temporally relevant, granular data from smart meters provide new opportunities for data-driven management of the power grid. New developments—for example, electricity markets with multiple suppliers, the integration of renewable power sources into the system, and spikier demand patterns due to, say, electric vehicles—create new challenges for efficient grid operation. Computer science is uniquely positioned to assist with increasingly sophisticated techniques for handling and learning from large amounts of data. The methods of game theory and multi-agent systems provide a natural framework for modeling the competing incentives of electricity market participants. This thesis focuses on the use of learning, optimization, mechanism design, and preference elicitation methods to coordinate electricity demand and supply while respecting the incentives of market participants. Specifically, we propose an approach where an autonomous agent acts on behalf of each household, coordinating with inhabitants to relay information and make decisions on their behalf about electricity consumption. We focus on three problems that arise in developing such agents: (i) how to coordinate consumers' electricity use, (ii) how to share the costs of consumption among households (via their agents), and (iii) how to gather consumption preference data from consumers.
Chapters 3 and 4 focus on different aspects of the first two problems. Both use a matching markets approach. In Chapter 3, we focus on the impact of demand smoothness and peaks on the supplier’s cost, and in Chapter 4, on the impact of predictability. In both chapters, we develop new cost sharing schemes that are resilient to certain forms of strategic behavior on the part of the agents and that achieve strong performance in experiments.
Chapter 5 studies the third problem. Motivated by control of heating and cooling systems, we present a new approach to preference elicitation, where the cost and accuracy of query responses are dependent on the user’s familiarity with the conditions specified in the query. We show that despite the theoretical difficulty in this setting, we can build solvers that perform well in practice.
Advisors/Committee Members: Boutilier, Craig, Computer Science.
Subjects/Keywords: cooperative games; energy; market design; Markov decision processes; preference elicitation; sequential decision-making; 0984
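To make the cost-sharing problem in this record concrete, here is a minimal sketch of one textbook scheme from cooperative game theory, the Shapley value, applied to a toy supplier cost that equals the peak of aggregate demand. This is not the scheme developed in the thesis: the cost function `peak_cost`, the household names, and the demand profiles are all illustrative assumptions.

```python
from itertools import permutations


def peak_cost(profiles):
    """Toy supplier cost (an assumption): the peak of aggregate demand."""
    if not profiles:
        return 0.0
    return float(max(sum(slot) for slot in zip(*profiles)))


def shapley_shares(demand):
    """Average each household's marginal cost over all arrival orders."""
    names = list(demand)
    shares = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        coalition = []
        for name in order:
            before = peak_cost(coalition)
            coalition.append(demand[name])
            shares[name] += peak_cost(coalition) - before
    return {name: total / len(orders) for name, total in shares.items()}


# Hypothetical demand per time slot; "b" consumes off-peak.
demand = {"a": [3, 0], "b": [0, 3], "c": [3, 0]}
shares = shapley_shares(demand)
```

In this toy instance the off-peak household "b" is charged less than "a" and "c", and the shares add up to the total peak cost. Enumerating all orderings is exponential in the number of households, so the sketch is only practical for very small groups.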

University of Michigan
9.
Zhang, Qi.
Making and Keeping Probabilistic Commitments for Trustworthy Multiagent Coordination.
Degree: PhD, Computer Science & Engineering, 2020, University of Michigan
URL: http://hdl.handle.net/2027.42/162948
In a large number of real-world domains, such as the control of autonomous vehicles, team sports, medical diagnosis and treatment, and many others, multiple autonomous agents need to take actions based on local observations, and are interdependent in the sense that they rely on each other to accomplish tasks. Thus, achieving desired outcomes in these domains requires interagent coordination. The form of coordination this thesis focuses on is commitments, where an agent, referred to as the commitment provider, specifies guarantees about its behavior to another, referred to as the commitment recipient, so that the recipient can plan and execute accordingly without taking into account the details of the provider's behavior. This thesis grounds the concept of commitments in decision-theoretic settings where the provider's guarantees might have to be probabilistic when its actions have stochastic outcomes and it expects to reduce its uncertainty about the environment during execution.
More concretely, this thesis presents a set of contributions that address three core issues for commitment-based coordination: probabilistic commitment adherence, interpretation, and formulation. The first contribution is a principled semantics for the provider to exercise maximal autonomy that responds to evolving knowledge about the environment without violating its probabilistic commitment, along with a family of algorithms for the provider to construct policies that provably respect the semantics and make explicit tradeoffs between computation cost and plan quality. The second contribution consists of theoretical analyses and empirical studies that improve our understanding of the recipient's interpretation of the partial information specified in a probabilistic commitment; the thesis shows that it is inherently easier for the recipient to robustly model a probabilistic commitment where the provider promises to enable preconditions that the recipient requires than where the provider instead promises to avoid changing already-enabled preconditions. The third contribution focuses on the problem of formulating probabilistic commitments for the fully cooperative provider and recipient; the thesis proves structural properties of the agents' values as functions of the parameters of the commitment specification that can be exploited to achieve orders of magnitude less computation for 1) formulating optimal commitments in a centralized manner, and 2) formulating (approximately) optimal queries that induce (approximately) optimal commitments for the decentralized setting in which information relevant to optimization is distributed among the agents.
Advisors/Committee Members: Baveja, Satinder Singh (committee member), Durfee, Edmund H (committee member), Lewis, Richard L (committee member), Sinha, Arunesh (committee member).
Subjects/Keywords: Multiagent Coordination; Sequential Decision Making; Commitment; Markov Decision Process; Computer Science; Engineering
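As a rough illustration of what a probabilistic commitment can look like, the sketch below checks a promise of the form "the precondition will be enabled within a given number of steps with at least a given probability". It assumes the provider's policy has already been folded into a Markov chain; the state names, the transition table, and the function names are hypothetical and are not taken from the thesis.

```python
def prob_reach_within(transitions, start, target, horizon):
    """Probability that the chain visits `target` within `horizon` steps."""
    if start == target:
        return 1.0
    alive = {start: 1.0}  # probability mass that has not reached the target yet
    reached = 0.0
    for _ in range(horizon):
        nxt = {}
        for state, p in alive.items():
            for state2, q in transitions[state].items():
                if state2 == target:
                    reached += p * q
                else:
                    nxt[state2] = nxt.get(state2, 0.0) + p * q
        alive = nxt
    return reached


def satisfies_commitment(transitions, start, target, horizon, min_prob):
    """True if the induced chain keeps the (horizon, probability) promise."""
    return prob_reach_within(transitions, start, target, horizon) >= min_prob


# Hypothetical chain induced by a provider policy: each step opens the door
# with probability 0.5, and an open door stays open.
chain = {
    "locked": {"locked": 0.5, "open": 0.5},
    "open": {"open": 1.0},
}
p_open = prob_reach_within(chain, "locked", "open", horizon=3)
```

Here `p_open` is 1 - 0.5^3, so a commitment to open the door within three steps with probability 0.8 is respected, while one with probability 0.9 is not.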

Carnegie Mellon University
10.
Reddy, Prashant P.
Semi-Cooperative Learning in Smart Grid Agents.
Degree: 2013, Carnegie Mellon University
URL: http://repository.cmu.edu/dissertations/542
Striving to reduce the environmental impact of our growing energy demand creates tough new challenges in how we generate and use electricity. We need to develop Smart Grid systems in which distributed sustainable energy resources are fully integrated and energy consumption is efficient. Customers, i.e., consumers and distributed producers, require agent technology that automates much of their decision-making to become active participants in the Smart Grid. This thesis develops models and learning algorithms for such autonomous agents in an environment where customers operate in modern retail power markets and thus have a choice of intermediary brokers with whom they can contract to buy or sell power. In this setting, customers face a learning and multiscale decision-making problem – they must manage contracts with one or more brokers and simultaneously, on a finer timescale, manage their consumption or production levels under existing contracts. On a contextual scale, they can optimize their isolated self-interest or consider their shared goals with other agents. We advance the idea that a Learning Utility Management Agent (LUMA), or a network of such agents, deployed on behalf of a Smart Grid customer can autonomously address that customer’s multiscale decision-making responsibilities. We study several relationships between a given LUMA and other agents in the environment. These relationships are semi-cooperative and the degree of expected cooperation can change dynamically with the evolving state of the world. We exploit the multiagent structure of the problem to control the degree of partial observability. Since a large portion of relevant hidden information is visible to the other agents in the environment, we develop methods for Negotiated Learning, whereby a LUMA can offer incentives to the other agents to obtain information that sufficiently reduces its own uncertainty while trading off the cost of offering those incentives.
The thesis first introduces pricing algorithms for autonomous broker agents, time series forecasting models for long-range simulation, and capacity optimization algorithms for multi-dwelling customers. We then introduce Negotiable Entity Selection Processes (NESP) as a formal representation where partial observability is negotiable amongst certain classes of agents. We then develop our ATTRACTION-BOUNDED-LEARNING algorithm, which leverages the variability of hidden information for efficient multiagent learning. We apply the algorithm to address the variable-rate tariff selection and capacity aggregate management problems faced by Smart Grid customers. We evaluate the work on real data using Power TAC, an agent-based Smart Grid simulation platform, and substantiate the value of autonomous Learning Utility Management Agents in the Smart Grid.
Subjects/Keywords: Semi-Cooperative Learning; Negotiated Learning; Multiagent Learning; Online Learning; Reinforcement Learning; Sequential Decision-Making

University of Michigan
11.
Riis, Jason.
Experienced difficulty in sequential decision making.
Degree: PhD, Social psychology, 2003, University of Michigan
URL: http://hdl.handle.net/2027.42/123684
Seven studies were conducted to examine the role of experiential factors in sequential decisions. People often have to make several decisions in short sequence, but the vast majority of academic work on decision making has examined decisions made in isolation. Several mechanisms are described by which deliberation about a prior decision could influence choice on a subsequent one. The studies demonstrated the operation of one class of mechanism – feelings that result from deliberating about a prior decision can influence preferences in a subsequent decision. Studies 1–3 demonstrated an order effect whereby more status quo selection was observed on a difficult decision among subjects who were first asked to make a preliminary, unrelated difficult decision. Studies 2 and 3 ruled out the possibility that the effect was due to a reduced tolerance for assuming responsibility or to a priming mechanism whereby the prior decision invoked particular thoughts that were relevant to the second decision. The effects are best explained by an account which appeals to the experienced difficulty invoked by the prior decision. This experience of difficulty decreases the decision maker's tolerance for uncertainty in subsequent decisions. Studies 4 and 5 extended the effect to risk aversion. Subjects who had to make a preliminary easy decision were less likely to choose a sure thing than were subjects who had to make a preliminary hard decision. The effect was significant in Study 5. Study 6 attempted to examine whether or not trait uncertainty would work in the same way as state uncertainty (which is an aspect of experienced difficulty). It did not, and trait uncertainty did not predict state uncertainty. Study 7 attempted a more direct manipulation of uncertainty by having one group of subjects give reasons for their choice. Consistent with previous studies, this did increase the subjects' subjective confidence (or certainty) but it did not decrease their preference for a status quo option. To the contrary, it increased that preference, but it is suggested that this manipulation incidentally manipulated accountability, perhaps to a greater degree than uncertainty, and this may have caused the effect.
Advisors/Committee Members: Schwarz, Norbert (advisor).
Subjects/Keywords: Difficulty; Experienced; Moral; Sequential Decision-making
12.
Judah, Kshitij.
New learning modes for sequential decision making.
Degree: PhD, Computer Science, 2014, Oregon State University
URL: http://hdl.handle.net/1957/47464
This thesis considers the problem in which a teacher is interested in teaching action policies to computer agents for sequential decision making. The vast majority of policy learning algorithms offer teachers little flexibility in how policies are taught. In particular, one of two learning modes is typically considered: 1) Imitation learning, where the teacher demonstrates explicit action sequences to the learner, and 2) Reinforcement learning, where the teacher designs a reward function for the learner to autonomously optimize via practice. This is in sharp contrast to how humans teach other humans, where many other learning modes are commonly used besides imitation and practice. This thesis presents novel learning modes for teaching policies to computer agents, with the eventual aim of allowing human teachers to teach computer agents more naturally and efficiently.
Our first learning mode is inspired by how humans learn: through rounds of practice followed by feedback from a teacher. We adopt this mode to create computer agents that learn from several rounds of autonomous practice followed by critique feedback from a teacher. Our results show that this mode of policy learning is more effective than pure reinforcement learning, though important usability issues arise when used with human teachers.
Next we consider a learning mode where the computer agent can actively ask questions to the teacher, which we call active imitation learning. We provide algorithms for active imitation learning that are proven to require strictly less interaction with the teacher than passive imitation learning. We also show that empirically active imitation learning algorithms are much more efficient than traditional passive imitation learning in terms of amount of interaction with the teacher.
Lastly, we introduce a novel imitation learning mode that allows a teacher to specify shaping rewards to a computer agent in addition to demonstrations. Shaping rewards are additional rewards supplied to an agent for accelerating policy learning via reinforcement learning. We provide an algorithm to incorporate shaping rewards in imitation learning and show that it learns from fewer demonstrations than pure imitation learning.
We wrap up by presenting a prototype User-Initiated Learning (UIL) system that allows an end user to demonstrate procedures containing optional steps and instruct the system to autonomously learn to predict when the optional steps should be executed, and remind the user if they forget. Our prototype supports user-initiated demonstration and learning via a natural interface, and has a built-in automated machine learning engine to automatically train and install a predictor for the requested prediction problem.
Advisors/Committee Members: Fern, Alan P. (advisor), Dietterich, Thomas G. (committee member).
Subjects/Keywords: Sequential Decision Making; Machine learning

George Mason University
13.
DeGregory, Keith W.
An Approximate Dynamic Program for Allocating Federal Air Marshals in Near Real-Time Under Uncertainty.
Degree: 2014, George Mason University
URL: http://hdl.handle.net/1920/9012
The Federal Air Marshal Service provides front-line security in homeland defense by protecting civil aviation from potential terrorist attacks. Unique challenges arise in maximizing effective deployment of a limited number of air marshals to cover the risk posed by potential terrorists on nearly 30,000 daily domestic and international flights. Some risk arises stochastically (e.g., a last-minute ticket sale where suspicion is aroused). Pre-scheduled air marshal deployments cannot respond to risk that arises stochastically in real time. This dissertation proposes the formation of a quick reaction force to explicitly address stochastic risk of terrorism on commercial flights and presents a method for near real-time force allocation to optimize risk coverage.
The dynamic allocation of reactionary air marshals requires sequential decision making under uncertainty with limited lead time. This dissertation investigates the application of an approximate dynamic program (ADP) to assist schedulers allocating air marshals in near real-time. ADP is a form of reinforcement learning that seeks optimal decisions by incorporating future impacts rather than optimizing only on short-term rewards. The marshal allocation system is modeled as a Markov decision process. Due to the many variables and environment complexity, explicit storage of all states and their values is not possible. Value function approximation schemes are explored to mitigate scalability challenges by alleviating the need for state value storage. The study demonstrates that air marshal allocation in near real-time is possible using an ADP with value function approximation and results in improved coverage of stochastic risk over the myopic approach or pre-scheduling.
Advisors/Committee Members: Ganesan, Rajesh (advisor).
Subjects/Keywords: diffusion wavelet; approximate dynamic programming; value function approximation; sequential decision making under uncertainty
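The idea of replacing a stored value table with a value function approximation can be sketched on a toy problem. The example below runs fitted value iteration with a two-parameter linear approximation on a hypothetical 50-state chain. The chain, its reward, the features, and every function name are assumptions made for illustration; none of this is the dissertation's air marshal model, and the dissertation's keywords suggest it explores other approximation schemes (diffusion wavelets).

```python
import random

GAMMA = 0.9     # discount factor (assumed)
N_STATES = 50   # hypothetical chain of states 0..49


def step(s, a):
    """Toy dynamics: a is -1 or +1; reward 1 for landing on the last state."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)


def value(w, s):
    """Linear approximation V(s) = w0 + w1 * x, with x the scaled position."""
    return w[0] + w[1] * (s / (N_STATES - 1))


def fit_line(xs, ys):
    """Ordinary least squares for y = w0 + w1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return (my, 0.0)
    w1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return (my - w1 * mx, w1)


def fitted_value_iteration(n_iters=50, n_samples=20, seed=0):
    """Fit the weights to Bellman backups computed on sampled states only."""
    rng = random.Random(seed)
    w = (0.0, 0.0)
    for _ in range(n_iters):
        states = [rng.randrange(N_STATES) for _ in range(n_samples)]
        targets = []
        for s in states:
            backups = []
            for a in (-1, 1):
                s2, r = step(s, a)
                backups.append(r + GAMMA * value(w, s2))
            targets.append(max(backups))
        w = fit_line([s / (N_STATES - 1) for s in states], targets)
    return w


def greedy_action(w, s):
    """Pick the action with the best one-step lookahead under the fitted V."""
    def lookahead(a):
        s2, r = step(s, a)
        return r + GAMMA * value(w, s2)
    return max((-1, 1), key=lookahead)


weights = fitted_value_iteration()
```

Only two weights are stored instead of fifty state values. A purely myopic rule sees zero immediate reward for both actions in state 0 and cannot tell them apart, whereas the lookahead through the fitted value function points toward the rewarding end of the chain. A linear fit is a crude stand-in here and is not guaranteed to converge in general.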

University of Maryland
14.
He, He.
SEQUENTIAL DECISIONS AND PREDICTIONS IN NATURAL LANGUAGE PROCESSING.
Degree: Computer Science, 2016, University of Maryland
URL: http://hdl.handle.net/1903/18580
Natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all information it needs and makes a one-time prediction. In this dissertation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve).
Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existing batch model and produce a dynamic model to handle sequential inputs and outputs. We build our framework upon the theory of Markov decision processes (MDPs), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision-making approaches.
We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all input, while incurring a much smaller cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
Advisors/Committee Members: Daume III, Hal (advisor).
Subjects/Keywords: Artificial intelligence; imitation learning; natural language processing; reinforcement learning; sequential decision-making
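The abstract above mentions evaluating both cost and accuracy, for example by plotting a Pareto curve. As a small sketch of how such a curve could be extracted, the function below keeps only the operating points that are not dominated by a cheaper and at least as accurate alternative. The function name and the sample points are hypothetical; they are not results from the dissertation.

```python
def pareto_frontier(points):
    """Return the (cost, accuracy) pairs that no other pair dominates.

    A pair is dominated if another pair has cost no higher and accuracy
    no lower, with at least one of the two strictly better.
    """
    frontier = []
    best_accuracy = float("-inf")
    # Sort by increasing cost; among equal costs, try the most accurate first.
    for cost, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        if accuracy > best_accuracy:
            frontier.append((cost, accuracy))
            best_accuracy = accuracy
    return frontier


# Made-up operating points: (amount of input consumed, prediction accuracy).
runs = [(1, 0.60), (2, 0.70), (2, 0.65), (3, 0.68), (4, 0.90)]
frontier = pareto_frontier(runs)
```

With these made-up points, the runs at (2, 0.65) and (3, 0.68) drop out because the run at (2, 0.70) is both cheaper or equal in cost and more accurate.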

University of Georgia
15.
Qu, Xia.
Strategic behavior under uncertainty in multiagent settings.
Degree: 2015, University of Georgia
URL: http://hdl.handle.net/10724/31278
Sequential decision making under uncertainty involves selecting a sequence of actions in the presence of noise to maximize an agent's expected utility. In multiagent settings, agents are uncertain not only about their actions' outcomes, their observations, and states, but also about actions of other agents sharing the environment. Therefore, an agent's behavior must be strategic and consider these uncertainties. One recognized framework relevant for decision making in multiagent settings is the interactive partially observable Markov decision process (I-POMDP). This research focuses on strategic behavior of humans and normative agents in multiagent settings. First, I study the behavior of humans in two classes of games and propose several new models of behavioral data collected when humans engaged in these games. The first class is a modified Centipede game for testing human recursive thinking. Recent experiments show that humans predominantly reason at lower levels; however, they display a higher level of reasoning if games are made simpler and more competitive. I model the data using the finitely-nested I-POMDP, appropriately simplified and augmented with models simulating human learning and choice. Results suggest that this process-oriented behavioral modeling provides a good fit of the data. My modeling further showed that humans attribute the same errors that they themselves make to others. The second class pertains to sequential bargaining where humans are widely observed as deviating from game-theoretic predictions. I construct a suite of new and existing computational process models that integrate different choice models with utility functions. Fairness and limited backward induction, both of which may possibly explain the behavioral deviations, are incorporated. My comparative analyses reveal that limited backward induction plays a crucial role in longer-round games while in shorter-round games, fairness remains the key consideration. Second, I present new methods for computing the strategic behavior of normative agents in the context of I-POMDPs. A new technique provides the first formalization of planning in finitely-nested I-POMDPs as a probabilistic inference problem. My comprehensive experimental results demonstrate that we may obtain solutions represented as compact finite state controllers whose quality is significantly better than previous policy iteration techniques though convergence may take more time.
Subjects/Keywords: Human behavior modeling; Behavioral game theory; Sequential decision making under uncertainty; EM algorithm
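For readers unfamiliar with partially observable models, the building block underneath a POMDP is the Bayesian belief update. The sketch below shows that update for an ordinary single-agent POMDP only. An I-POMDP additionally places models of the other agents inside the state, which is not attempted here. The table layout (`T[s][a][s2]`, `O[a][s2][o]`) and the two-state "tiger" style example are assumptions chosen for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes filter: b'(s2) is proportional to O[a][s2][o] * sum_s T[s][a][s2] * b(s)."""
    unnormalized = {}
    for s2 in belief:
        predicted = sum(T[s][action][s2] * belief[s] for s in belief)
        unnormalized[s2] = O[action][s2][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state problem: listening does not move the hidden state
# and reports the correct side with probability 0.85.
T = {
    "left": {"listen": {"left": 1.0, "right": 0.0}},
    "right": {"listen": {"left": 0.0, "right": 1.0}},
}
O = {
    "listen": {
        "left": {"hear-left": 0.85, "hear-right": 0.15},
        "right": {"hear-left": 0.15, "hear-right": 0.85},
    }
}
prior = {"left": 0.5, "right": 0.5}
posterior = belief_update(prior, "listen", "hear-left", T, O)
```

Starting from a uniform prior, a single "hear-left" observation moves the belief in "left" to 0.85, and repeated consistent observations sharpen it further.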
APA (6th Edition):
Qu, X. (2015). Strategic behavior under uncertainty in multiagent settings. (Thesis). University of Georgia. Retrieved from http://hdl.handle.net/10724/31278
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Qu, Xia. “Strategic behavior under uncertainty in multiagent settings.” 2015. Thesis, University of Georgia. Accessed March 04, 2021.
http://hdl.handle.net/10724/31278.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Qu, Xia. “Strategic behavior under uncertainty in multiagent settings.” 2015. Web. 04 Mar 2021.
Vancouver:
Qu X. Strategic behavior under uncertainty in multiagent settings. [Internet] [Thesis]. University of Georgia; 2015. [cited 2021 Mar 04].
Available from: http://hdl.handle.net/10724/31278.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Qu X. Strategic behavior under uncertainty in multiagent settings. [Thesis]. University of Georgia; 2015. Available from: http://hdl.handle.net/10724/31278
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Texas – Austin
16.
-8992-4917.
Autonomous dynamic decision making in fuel cycle simulators using a game theoretic approach.
Degree: PhD, Mechanical Engineering, 2019, University of Texas – Austin
URL: http://dx.doi.org/10.26153/tsw/2834
A novel methodology for optimizing nuclear fuel cycle transitions that captures interactions between a policy maker and electric utility company is presented. The methodology is demonstrated using a two-person general-sum sequential game with uncertainty that is implemented using a nuclear fuel cycle simulator capable of calculating a material- and technology-constrained material balance, coupled to a multi-objective optimization solver. The solver explicitly treats uncertainties using a stochastic programming approach with chance nodes depicted as a Nature player who moves randomly. The methodology is demonstrated through a Transition Game that features tradeoffs between investments in competing reprocessing and waste disposal technologies, dynamic reactor deployment responses to resolutions in reactor capital cost uncertainty, and the influence of capital subsidies on the future nuclear technology mix. Each player in the game uses a unique set of decision criteria to identify optimal near-term hedging strategies that consider all of Nature’s possible moves as well as the other player’s available decisions. These hedging strategies balance the risk of immediate action against the risk of delay and maintain flexibility to allow for intelligent recourse decisions once uncertainties are resolved. Results from the Transition Game indicate that early transition to high-temperature gas-cooled reactors is preferred, with the option to abandon the transition following a learning period if capital costs are unfavorable. Under these conditions, transition to used fuel recycling in sodium-cooled fast reactors may be spurred by policy incentives under certain decision criteria weightings. Otherwise, with a baseline set of decision criteria weightings, transition to a closed fuel cycle is never observed when players hedge optimally against Nature’s moves. Only when players have perfect information regarding Nature’s future moves is transition to a closed fuel cycle observed.
Advisors/Committee Members: Haas, Derek Anderson, 1981- (advisor), Leibowicz, Benjamin D. (advisor), Landsberger, Sheldon (committee member), Wilson, Paul (committee member).
Subjects/Keywords: Nuclear fuel cycle; Systems analysis; Sequential decision making under uncertainty; Game theory
APA (6th Edition):
-8992-4917. (2019). Autonomous dynamic decision making in fuel cycle simulators using a game theoretic approach. (Doctoral Dissertation). University of Texas – Austin. Retrieved from http://dx.doi.org/10.26153/tsw/2834
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Chicago Manual of Style (16th Edition):
-8992-4917. “Autonomous dynamic decision making in fuel cycle simulators using a game theoretic approach.” 2019. Doctoral Dissertation, University of Texas – Austin. Accessed March 04, 2021.
http://dx.doi.org/10.26153/tsw/2834.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
MLA Handbook (7th Edition):
-8992-4917. “Autonomous dynamic decision making in fuel cycle simulators using a game theoretic approach.” 2019. Web. 04 Mar 2021.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Vancouver:
-8992-4917. Autonomous dynamic decision making in fuel cycle simulators using a game theoretic approach. [Internet] [Doctoral dissertation]. University of Texas – Austin; 2019. [cited 2021 Mar 04].
Available from: http://dx.doi.org/10.26153/tsw/2834.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Council of Science Editors:
-8992-4917. Autonomous dynamic decision making in fuel cycle simulators using a game theoretic approach. [Doctoral Dissertation]. University of Texas – Austin; 2019. Available from: http://dx.doi.org/10.26153/tsw/2834
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete

Rutgers University
17.
Fedzhora, Liliya.
A linear programming model for sequential testing.
Degree: PhD, Operations Research, 2008, Rutgers University
URL: http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.17466
In this study, a linear programming model is formulated that finds an optimal strategy for many decision-making problems that typically arise in homeland security, banking, medicine, and engineering. We consider the problem of deploying a set of tests most effectively when the goal is to detect as many "bad" objects as possible among the vast majority of "good" ones, for example, in searching for contraband. The study assumes that the functional dependency between test results and object type is unknown. The model finds an optimal testing strategy in the form of a decision tree that minimizes the expected cost given a detection rate or maximizes the detection rate given the budget. The mathematical basis for the model is a polyhedral description of all decision trees in higher dimensional space.
Decision trees are widely used in data mining and machine learning. The notion of VC-dimension is used to evaluate bounds on the sample size required for the learning model. For some classes of Boolean functions the VC-dimension is already known, for example, for monomials and threshold functions. In Chapter 10, the VC-dimension of Horn functions is derived, and also the VC-dimension of a more general class of k-quasi-Horn functions. In Chapter 11 we state and prove a criterion for k-quasi-Horn functions that generalizes McKinsey's theorem. Also, necessary and sufficient conditions for a function to be bidual k-quasi-Horn are stated and proved.
Advisors/Committee Members: Fedzhora, Liliya (author), Prekopa, Andras (chair), Boros, Endre (internal member), Gurvich, Vladimir (internal member), Kantor, Paul (internal member), Vizvari, Bela (outside member).
Subjects/Keywords: Decision making; Data mining; Sequential analysis
APA (6th Edition):
Fedzhora, L. (2008). A linear programming model for sequential testing. (Doctoral Dissertation). Rutgers University. Retrieved from http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.17466
Chicago Manual of Style (16th Edition):
Fedzhora, Liliya. “A linear programming model for sequential testing.” 2008. Doctoral Dissertation, Rutgers University. Accessed March 04, 2021.
http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.17466.
MLA Handbook (7th Edition):
Fedzhora, Liliya. “A linear programming model for sequential testing.” 2008. Web. 04 Mar 2021.
Vancouver:
Fedzhora L. A linear programming model for sequential testing. [Internet] [Doctoral dissertation]. Rutgers University; 2008. [cited 2021 Mar 04].
Available from: http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.17466.
Council of Science Editors:
Fedzhora L. A linear programming model for sequential testing. [Doctoral Dissertation]. Rutgers University; 2008. Available from: http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.17466

Freie Universität Berlin
18.
Green, Nikos.
Die Mechanismen des Entscheidungskriteriums.
Degree: 2013, Freie Universität Berlin
URL: http://dx.doi.org/10.17169/refubium-14155
In this thesis I investigated mechanisms of the human brain that lead to adjustment of the decision criterion. For perceptual decisions, such as recognizing the direction of motion of an object (e.g., left or right), theoretical and empirical findings suggest that sensory information is accumulated continuously until a decision criterion is crossed. This process can be described mathematically by theoretical models known as accumulator models. Based on extensive behavioral data and on neurophysiological and neuroimaging findings, these models are assumed to capture the underlying information processing in the brain very well. A central parameter of these models is the decision criterion, because it determines the decision process. This dissertation discusses two projects in which I investigated neural mechanisms of the decision criterion. The first project centers on the neural mechanism by which the decision criterion is adjusted (decision threshold modulation). Starting from a biophysical model that implements the decision process described above in a neural network, I examined how the decision criterion can be changed. In the neural network, the decision criterion is modulated by adjusting interactions between cortical accumulator (integrator) neurons and striatal neurons. Using imaging methods and computational models, I was able to show that the decision criterion is adjusted by modulating interactions between brain regions that are relevant for a decision. Accumulator models can be derived theoretically from statistical tests of optimal decision making. Starting from a computational model of optimal decision making in cortico-basal ganglia networks, in the second project I investigated how the subthalamic nucleus (STN), which plays a central role in decisions in this model, influences the decision criterion in simple perceptual decisions. To do so, I used deep brain stimulation (DBS) of the STN in patients with Parkinson's disease. In this study I was able to confirm the model's assumptions about the function of the STN. I was also able to show that DBS of the STN restricts modulation of the decision criterion. In summary, my dissertation shows, first, that the decision criterion is modulated by a change in the coupling between cortical and subcortical brain systems and, second, that this adjustment is influenced by signals from the STN.
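As a rough illustration of the accumulator models mentioned in this abstract, the following Python sketch simulates noisy evidence that accumulates until it crosses a decision criterion. The function names, the drift and noise values, and the symmetric two-bound setup are illustrative assumptions and are not taken from the thesis.

```python
import random

def simulate_decision(threshold, drift=0.1, noise=1.0, max_steps=10_000, rng=None):
    """Accumulate noisy evidence until |evidence| reaches the threshold.

    Returns (choice, steps); choice is +1 or -1, or 0 if no bound was reached.
    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = rng or random.Random()
    evidence = 0.0
    for step in range(1, max_steps + 1):
        evidence += drift + rng.gauss(0.0, noise)  # one noisy sample of evidence
        if evidence >= threshold:
            return +1, step   # upper bound: choice in the direction of the drift
        if evidence <= -threshold:
            return -1, step   # lower bound: choice against the drift
    return 0, max_steps

def summarize(threshold, trials=2000, seed=0):
    """Fraction of drift-consistent choices and mean decision time."""
    rng = random.Random(seed)
    results = [simulate_decision(threshold, rng=rng) for _ in range(trials)]
    accuracy = sum(1 for choice, _ in results if choice == 1) / trials
    mean_steps = sum(steps for _, steps in results) / trials
    return accuracy, mean_steps

low = summarize(threshold=2.0)   # liberal decision criterion
high = summarize(threshold=6.0)  # conservative decision criterion
print("low threshold :", low)
print("high threshold:", high)
```

In this toy setting, raising the criterion trades speed for accuracy, which is one reason the criterion is treated as a central model parameter.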
Advisors/Committee Members: Prof. Dr. Shu Chen Li (inspector), Prof. Dr. Arthur Jacobs (inspector), Prof. Dr. Hauke Heekeren (first referee), Prof. Dr. Andrea Kühn (further referee).
Subjects/Keywords: decision making; perception; decision threshold; sequential sampling; 100 Philosophy and psychology::150 Psychology; 500 Natural sciences and mathematics::570 Life sciences; Biology
APA (6th Edition):
Green, N. (2013). Die Mechanismen des Entscheidungskriteriums. (Thesis). Freie Universität Berlin. Retrieved from http://dx.doi.org/10.17169/refubium-14155
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Green, Nikos. “Die Mechanismen des Entscheidungskriteriums.” 2013. Thesis, Freie Universität Berlin. Accessed March 04, 2021.
http://dx.doi.org/10.17169/refubium-14155.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Green, Nikos. “Die Mechanismen des Entscheidungskriteriums.” 2013. Web. 04 Mar 2021.
Vancouver:
Green N. Die Mechanismen des Entscheidungskriteriums. [Internet] [Thesis]. Freie Universität Berlin; 2013. [cited 2021 Mar 04].
Available from: http://dx.doi.org/10.17169/refubium-14155.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Green N. Die Mechanismen des Entscheidungskriteriums. [Thesis]. Freie Universität Berlin; 2013. Available from: http://dx.doi.org/10.17169/refubium-14155
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Pennsylvania
19.
Luu, Long.
Self-Consistency In Sequential Decision-Making.
Degree: 2018, University of Pennsylvania
URL: https://repository.upenn.edu/edissertations/3152
Human decisions are rarely made in isolation. We typically have to make a sequence of decisions to reach a goal. Studies in economics and cognitive psychology have shown that making a decision may result in several biases in subsequent judgments. Similar biases have also recently been found in human percepts of low-level stimuli such as motion direction. What is lacking is a principled framework that can account for several sequential dependencies between judgments. Towards that goal, in my thesis, I propose and experimentally test a self-consistent Bayesian observer model that assumes humans maintain self-consistency along the inference process. In Chapter 2, I first demonstrate that after having made a categorical decision on stimulus orientation, subjects’ estimate of the stimulus is systematically biased away from the decision boundary. Two additional experiments suggest that the bias occurs because subjects treat their first decision as a fact and use that to constrain the subsequent estimation. Model fits to the data in my experiments and to data in previous studies show that the self-consistent Bayesian model can quantitatively account for human behaviors in a wide range of experimental settings. In Chapter 3, using the same decision-estimation tasks, I probed the post-decision sensory representation by providing feedback on the categorical decision. I found that subjects’ sensory representation is kept intact and the self-consistency is implemented by conditioning the prior distribution on the categorical decision. The results also suggest another interesting form of self-consistency when subjects’ decision was incorrect: they reconstructed the sensory measurement to make it consistent with the given feedback. In Chapter 4, I found that the choice-induced bias also occurs in human judgment of number. The bias is similar for both non-symbolic (cloud of dots) and symbolic (sequence of Arabic numerals) forms of number.
Finally, I propose in the general discussion how the self-consistent Bayesian framework may account for other biases in sequential decision-making such as the halo effect and sunk-cost fallacy.
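The idea of conditioning the prior on a preceding categorical decision can be sketched numerically. The Python example below is a minimal toy version under assumed settings (a uniform prior on a grid, Gaussian measurement noise, a decision boundary at 0, and hypothetical function names); it is not the model fitted in the thesis.

```python
import math

def likelihood_weights(measurement, sigma, grid, prior):
    """Unnormalized posterior weights on a grid under Gaussian measurement noise."""
    return [p * math.exp(-0.5 * ((measurement - theta) / sigma) ** 2)
            for theta, p in zip(grid, prior)]

def standard_estimate(measurement, sigma, grid, prior):
    """Posterior mean using the full prior (no decision conditioning)."""
    w = likelihood_weights(measurement, sigma, grid, prior)
    return sum(t * wi for t, wi in zip(grid, w)) / sum(w)

def self_consistent_estimate(measurement, sigma, grid, prior, boundary=0.0):
    """Decide the category first, then estimate with the prior conditioned on it."""
    w = likelihood_weights(measurement, sigma, grid, prior)
    mass_right = sum(wi for t, wi in zip(grid, w) if t > boundary)
    decided_right = mass_right >= sum(w) - mass_right
    # Treat the decision as a fact: zero out the prior on the rejected side.
    conditioned = [p if (t > boundary) == decided_right else 0.0
                   for t, p in zip(grid, prior)]
    estimate = standard_estimate(measurement, sigma, grid, conditioned)
    return decided_right, estimate

# Assumed setup: orientations from -20 to 20 (arbitrary units), no grid point at 0.
grid = [(i + 0.5) * 0.1 - 20.0 for i in range(400)]
prior = [1.0 / len(grid)] * len(grid)

plain = standard_estimate(1.0, 4.0, grid, prior)
decided_right, conditioned = self_consistent_estimate(1.0, 4.0, grid, prior)
print(plain, decided_right, conditioned)
```

With a noisy measurement close to the boundary, the conditioned estimate lies farther from the boundary than the unconditioned one, which is the direction of the bias the abstract describes.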
Subjects/Keywords: Bayesian decision theory; Decision-making; Perception; Psychophysics; Self-consistency; Sequential judgment; Neuroscience and Neurobiology; Psychology; Social and Behavioral Sciences
APA (6th Edition):
Luu, L. (2018). Self-Consistency In Sequential Decision-Making. (Thesis). University of Pennsylvania. Retrieved from https://repository.upenn.edu/edissertations/3152
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Luu, Long. “Self-Consistency In Sequential Decision-Making.” 2018. Thesis, University of Pennsylvania. Accessed March 04, 2021.
https://repository.upenn.edu/edissertations/3152.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Luu, Long. “Self-Consistency In Sequential Decision-Making.” 2018. Web. 04 Mar 2021.
Vancouver:
Luu L. Self-Consistency In Sequential Decision-Making. [Internet] [Thesis]. University of Pennsylvania; 2018. [cited 2021 Mar 04].
Available from: https://repository.upenn.edu/edissertations/3152.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Luu L. Self-Consistency In Sequential Decision-Making. [Thesis]. University of Pennsylvania; 2018. Available from: https://repository.upenn.edu/edissertations/3152
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Delft University of Technology
20.
Moerland, T.M.
The Intersection of Planning and Learning.
Degree: 2021, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; urn:NBN:nl:ui:24-uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; 10.4233/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418
Intelligent sequential decision making is a key challenge in artificial intelligence. The problem, commonly formalized as a Markov Decision Process, is studied in two different research communities: planning and reinforcement learning. Departing from a fundamentally different assumption about the type of access to the environment, both research fields have developed their own solution approaches and conventions. The combination of both fields, known as model-based reinforcement learning, has recently shown state-of-the-art results, for example defeating human experts in classic board games like Chess and Go. Nevertheless, the literature lacks an integrated view on 1) the similarities between planning and learning, and 2) the possible combinations of both. This dissertation aims to fill this gap. The first half of the book presents a conceptual answer to both questions. We first present a framework that disentangles the common algorithmic space of both fields, showing that they essentially face the same algorithmic design decisions. Moreover, we also present an overview of the different ways in which planning and learning can be combined in one algorithm. The second half of the dissertation provides experimental illustration of these ideas. We present several new combinations of planning and learning, such as a flexible method to learn stochastic dynamics models with neural networks, an extension of a successful planning-learning algorithm (AlphaZero) to deal with continuous action spaces, and a study of the empirical trade-off between planning and learning. Finally, we also illustrate the commonalities between both fields, by designing a new algorithm in one field based on inspiration from the other field. We conclude the thesis with an outlook for the planning-learning field as a whole. Altogether, the dissertation provides a broad theoretical and empirical view on the combination of planning and learning, which promises to be an important frontier in artificial intelligence research in the coming years.
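The difference in environment access between planning and learning can be illustrated on a toy Markov Decision Process. The Python sketch below is an assumption-laden illustration (a hypothetical four-state chain, textbook value iteration, and tabular Q-learning); it is not one of the algorithms developed in the dissertation.

```python
import random

N_STATES, GOAL, GAMMA = 4, 3, 0.9
ACTIONS = (-1, +1)  # move left or right along a chain of states

def step(state, action):
    """Deterministic toy dynamics: reward 1 on reaching the goal state."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def value_iteration(sweeps=100):
    """Planning: the model `step` may be queried for any state and action."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(GOAL):  # the goal is terminal, so its value stays 0
            values[s] = max(r + GAMMA * values[s2]
                            for s2, r in (step(s, a) for a in ACTIONS))
    return values

def q_learning(episodes=2000, alpha=0.5, seed=0):
    """Learning: only transitions sampled along episodes are available."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = rng.randrange(2)  # uniformly random exploration
            s2, r = step(s, ACTIONS[a])
            q[s][a] += alpha * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
    return [max(row) for row in q]

planned = value_iteration()
learned = q_learning()
print(planned)
print(learned)
```

Both routines apply the same Bellman backup; they differ only in whether the backup is computed from the model or from sampled experience, which is the kind of shared design space the abstract refers to.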
Advisors/Committee Members: Jonker, C.M., Plaat, Aske, Broekens, D.J., Delft University of Technology.
Subjects/Keywords: Planning; Reinforcement learning; Sequential decision making; Markov decision process
APA (6th Edition):
Moerland, T. M. (2021). The Intersection of Planning and Learning. (Doctoral Dissertation). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; urn:NBN:nl:ui:24-uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; 10.4233/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418
Chicago Manual of Style (16th Edition):
Moerland, T M. “The Intersection of Planning and Learning.” 2021. Doctoral Dissertation, Delft University of Technology. Accessed March 04, 2021.
http://resolver.tudelft.nl/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; urn:NBN:nl:ui:24-uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; 10.4233/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418.
MLA Handbook (7th Edition):
Moerland, T M. “The Intersection of Planning and Learning.” 2021. Web. 04 Mar 2021.
Vancouver:
Moerland TM. The Intersection of Planning and Learning. [Internet] [Doctoral dissertation]. Delft University of Technology; 2021. [cited 2021 Mar 04].
Available from: http://resolver.tudelft.nl/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; urn:NBN:nl:ui:24-uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; 10.4233/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418.
Council of Science Editors:
Moerland TM. The Intersection of Planning and Learning. [Doctoral Dissertation]. Delft University of Technology; 2021. Available from: http://resolver.tudelft.nl/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; urn:NBN:nl:ui:24-uuid:5437884e-0078-4b36-b2c7-c6edfea3b418 ; 10.4233/uuid:5437884e-0078-4b36-b2c7-c6edfea3b418

Australian National University
21.
Kinathil, Shamin.
Closed-form Solutions to Sequential Decision Making within Markets.
Degree: 2018, Australian National University
URL: http://hdl.handle.net/1885/186490
Sequential decision making is a pervasive and inescapable requirement of everyday life. Deciding upon which sequence of actions to take is complicated by incomplete information about the environment, the effects of each decision upon the future state of the environment, ill-defined objectives and our own cognitive limitations. These challenges are exacerbated in financial markets, which are in a constant state of flux, with prices adjusting to new information, winning traders replacing losing traders and the introduction of new technologies. Decision theoretic planning provides powerful and flexible frameworks for many sequential decision making problems within financial markets. Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) are two such frameworks, which can model financial decision making problems with discrete variables, continuous variables, or both. Traditional methods to solve MDPs and POMDPs use Monte-Carlo based methods or discretise the continuous state space. This is often to the detriment of the quality of the solution, which in many financial domains, such as those involving asset prices, is required to be exact and closed-form. Closed-form solutions, by virtue of being declarative and symbolic, allow the decision maker to undertake further analysis and optimisation. In this thesis we present novel techniques to calculate exact and closed-form solutions to new classes of MDPs and POMDPs by leveraging Symbolic Dynamic Programming (SDP). Our specific contributions include: (i) closed-form solutions to a subclass of Continuous Stochastic Games, which we use to encapsulate and solve a zero-sum game between a Binary Option trader and an adversarial market; (ii) closed-form solutions to Sequential Market Making with Inventory, which extends seminal works from market microstructure theory and algorithmic market-making; and (iii) Analytic Decision Analysis for Parameterised Hybrid MDPs, which we use to examine the sensitivity of a model for optimal transaction execution to its parameters. Our experimental evaluations confirm the efficacy of our novel solutions for a range of new classes of MDPs and POMDPs.
Subjects/Keywords: sequential decision making; Markov decision processes; partially observable Markov decision processes; closed-form solutions; market making; continuous; stochastic games; hybrid; mdp; pomdp; symbolic dynamic programming; sdp; dynamic programming
APA (6th Edition):
Kinathil, S. (2018). Closed-form Solutions to Sequential Decision Making within Markets. (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/186490
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Kinathil, Shamin. “Closed-form Solutions to Sequential Decision Making within Markets.” 2018. Thesis, Australian National University. Accessed March 04, 2021.
http://hdl.handle.net/1885/186490.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Kinathil, Shamin. “Closed-form Solutions to Sequential Decision Making within Markets.” 2018. Web. 04 Mar 2021.
Vancouver:
Kinathil S. Closed-form Solutions to Sequential Decision Making within Markets. [Internet] [Thesis]. Australian National University; 2018. [cited 2021 Mar 04].
Available from: http://hdl.handle.net/1885/186490.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Kinathil S. Closed-form Solutions to Sequential Decision Making within Markets. [Thesis]. Australian National University; 2018. Available from: http://hdl.handle.net/1885/186490
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Hawaii – Manoa
22.
Peterson, Henry Howard.
Analytic solutions to small scale two level programs with applications to the United States Department of Agriculture grain commodities programs.
Degree: PhD, 2009, University of Hawaii – Manoa
URL: http://hdl.handle.net/10125/9209
Binder's title on spine: United States Department of Agriculture grain commodities programs.
Typescript.
Bibliography: leaves 100-102.
Photocopy.
x, 102 leaves 29 cm
The two level program is a model of a two stage, sequential decision making process involving two decision makers. Neither decision maker has control over all of the variables. The level one decisions are made first. Next, the level two decisions are made using the level one decisions as exogenous data. Various interactions can occur between the two levels and it is thus necessary for the level one decision makers to include in their decision making the possible reactions at level two. The optimal solution simultaneously satisfies both the level one program and the subprogram at level two. Published research to date has been primarily concerned with the search for efficient computer algorithms for two level programs. The main purpose of this research is to investigate analytic solutions to complex, nonlinear programs that are beyond solution by typical computer algorithms. The methodology derived is based on the well known theorems for the necessary and sufficient conditions for the solution of nonlinear programs by Kuhn-Tucker [1951] and by Arrow-Enthoven [1961]. First, the level two solution is derived as a function of the level one decision variables. Next, the level one solution is derived having incorporated the level two solutions. A nonlinear, two commodity model is formulated of the United States Department of Agriculture (USDA) grain commodities support program. Linear and nonlinear price and demand functions are explicitly included as functions of acreage withheld from production under the program. Other inputs include direct and cross elasticities of supply. Input data is from United States government publications. Methodology is derived for estimating the parameters of the price and demand functions. The results of the study indicate the feasibility and usefulness of using analytic methods to solve complex, nonlinear two level programs. Applications to the USDA's grain commodities program provided realistic projections of commodity prices, carryovers and program costs. Similar research should be conducted in other areas involving sequential decision making. The study should be useful as a reference in classroom studies.
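The two-step procedure described in this abstract (solve level two as a function of the level one decision, then solve level one with that reaction substituted) can be sketched on a small quadratic example. The objectives and function names below are hypothetical and chosen only so that the closed form is easy to verify; they are unrelated to the USDA model in the dissertation.

```python
def follower_best_response(x):
    """Level two: minimize (y - x)**2 + y**2 over y, treating x as given.

    Setting the derivative 2*(y - x) + 2*y to zero gives y = x / 2.
    """
    return x / 2.0

def leader_objective(x):
    """Level one objective (x - 3)**2 + y**2 with the level two reaction substituted."""
    y = follower_best_response(x)
    return (x - 3.0) ** 2 + y ** 2

# Closed form: d/dx [(x - 3)**2 + x**2 / 4] = 2*(x - 3) + x / 2 = 0, so x = 2.4.
x_star = 2.4
y_star = follower_best_response(x_star)

# Numerical cross-check over a coarse grid of level one decisions.
grid = [i / 100.0 for i in range(0, 601)]
x_grid = min(grid, key=leader_objective)

print(x_star, y_star, x_grid)
```

The grid search is included only as a sanity check on the analytic solution; the point of the analytic route is that the level two reaction enters the level one problem as an explicit function.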
Subjects/Keywords: Decision making – Mathematical models; Sequential analysis; Grain trade – Decision making – Mathematical models
APA (6th Edition):
Peterson, H. H. (2009). Analytic solutions to small scale two level programs with applications to the United States Department of Agriculture grain commodities programs. (Doctoral Dissertation). University of Hawaii – Manoa. Retrieved from http://hdl.handle.net/10125/9209
23.
Pinto, Jervis.
Incorporating and Learning Behavior Constraints for Sequential Decision Making.
Degree: PhD, Computer Science, 2015, Oregon State University
URL: http://hdl.handle.net/1957/56129
▼ Writing a program that performs well in a complex environment is a challenging task. In such problems, a method of deterministic programming combined with reinforcement learning (RL) can be helpful. However, current systems either force developers to encode knowledge in very specific forms (e.g., state-action features), or assume advanced RL knowledge (e.g., ALISP).
This thesis explores techniques that make it easier for developers, who may not be RL experts, to encode their knowledge in the form of behavior constraints. We begin with the framework of adaptation-based programming (ABP) for writing self-optimizing programs. Next, we show how a certain type of conditional independency called "influence information" arises naturally in ABP programs. We propose two algorithms for learning reactive policies that are capable of leveraging this knowledge. Using influence information to simplify the credit assignment problem produces significant performance improvements.
Next, we turn our attention to problems in which a simulator allows us to replace reactive decision-making with time-bounded search, which often outperforms purely reactive decision-making at significant computational cost. We propose a new type of behavior constraint in the form of partial policies, which restrict behavior to a subset of good actions. Using a partial policy to prune sub-optimal actions reduces the action branching factor, thereby speeding up search. We propose three algorithms for learning partial policies offline, based on reducing the learning problem to i.i.d. supervised learning, and we give a reduction-style analysis for each one. We give concrete implementations using the popular framework of Monte-Carlo tree search. Experiments on challenging problems demonstrate large performance improvements in search-based decision-making generated by the learned partial policies.
Taken together, this thesis outlines a programming framework for injecting different forms of developer knowledge into reactive policy learning algorithms and search-based online planning algorithms. It represents a few small steps towards a programming paradigm that makes it easy to write programs that learn to perform well.
Advisors/Committee Members: Fern, Alan P. (advisor), Tadepalli, Prasad (committee member).
Subjects/Keywords: online sequential decision making; Reinforcement learning

University of Technology, Sydney
24.
Fang, M.
Bandit learning for sequential decision making : a practical way to address the trade-off between exploration and exploitation.
Degree: 2015, University of Technology, Sydney
URL: http://hdl.handle.net/10453/39169
▼ Sequential decision making involves actively acquiring information and then making decisions among a large number of uncertain options, as in recommendation systems and on the Internet. Sequential decisions become challenging because the feedback is often only partially observed. In this thesis we propose new “bandit learning” algorithms, whose basic idea is to address the fundamental trade-off between exploration and exploitation in sequence. The goal of bandit learning algorithms is to maximize some objective when making decisions. We study several novel methodologies for different scenarios, such as social networks, multi-view learning, multi-task learning, repeated labeling, and active learning. We formalize these adaptive problems as sequential decision making for different real applications. We present several new insights into these popular problems from a bandit perspective. We address the trade-off between exploration and exploitation using a bandit framework.
In particular, we introduce “networked bandits” to model multi-armed bandits with correlations, which exist in social networks. “Networked bandits” is a new bandit model that considers a set of interrelated arms that vary over time, in which selecting an arm invokes the other arms. The objective is still to obtain the best cumulative payoffs. We propose a method that considers both the arm and its relationships with other arms. The proposed method selects an arm according to the integrated confidence sets constructed from historical data.
We study the problem of view selection in stream-based multi-view learning, where each view is obtained from a feature generator or source and is embedded in a reproducing kernel Hilbert space (RKHS). We propose an algorithm that selects a near-optimal subset of m out of n views and then makes the prediction based on that subset. To address this problem, we define the multi-view simple regret and study an upper bound on the expected regret of our algorithm. The proposed algorithm relies on the Rademacher complexity of the co-regularized kernel classes.
We address an active learning scenario in the multi-task learning problem. Considering that labeling effective instances across different tasks may improve the generalization error of all tasks, we propose a new active multi-task learning algorithm based on multi-armed bandits for effectively selecting instances. The proposed algorithm balances the trade-off between exploration and exploitation by considering both the risk of the multi-task learner and the corresponding confidence bounds.
We study a popular annotation problem in crowdsourcing systems: repeated labeling. We introduce a new framework that actively selects labeling tasks when facing a large number of them. The objective is to identify the best labeling tasks among these noisy labeling tasks. We formalize the selection of repeated labeling tasks within a bandit framework. We consider a labeling task as an arm and the quality of a labeling task as the payoff. We introduce the definition of ε-optimal labeling task and use it to…
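The exploration/exploitation trade-off that this abstract keeps returning to can be made concrete with the classical UCB1 rule for stochastic multi-armed bandits. The sketch below is the textbook algorithm, not the networked, multi-view, or multi-task methods proposed in the thesis; the function names `ucb1` and `bernoulli_arms`, the arm means, and the horizon are assumptions made for illustration.

```python
import math
import random


def ucb1(pull, n_arms, rounds):
    """Run UCB1 for `rounds` steps and return how often each arm was played.

    `pull(arm)` must return a reward in [0, 1] for the chosen arm.
    """
    counts = [0] * n_arms   # number of times each arm has been played
    sums = [0.0] * n_arms   # total reward collected from each arm
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1     # initialisation: play every arm once
        else:
            # Empirical mean (exploitation) plus confidence bonus (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


def bernoulli_arms(means, seed=0):
    """Hypothetical environment: arm i pays 1 with probability means[i]."""
    rng = random.Random(seed)
    return lambda arm: 1.0 if rng.random() < means[arm] else 0.0


if __name__ == "__main__":
    print(ucb1(bernoulli_arms([0.1, 0.5, 0.9]), n_arms=3, rounds=2000))
```

The bonus term shrinks as an arm is played more often, so rarely played arms are revisited occasionally while most plays go to the arm with the best empirical mean.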
Subjects/Keywords: Sequential decision making; Bandit learning; Networked bandits; Stream-based multi-view learning; Simple repeated labeling strategy
25.
Hadoux, Emmanuel.
Markovian sequential decision-making in non-stationary environments : application to argumentative debates : Décision séquentielle markovienne en environnements non-stationnaires : application aux débats d'argumentation.
Degree: Docteur es, Informatique, 2015, Université Pierre et Marie Curie – Paris VI
URL: http://www.theses.fr/2015PA066489
▼ In sequential decision-making problems under uncertainty, an agent makes decisions, one after another, based on the current state of the environment in which it evolves. In most work, this environment is assumed to be stationary, i.e., its dynamics do not change over time. However, the stationarity hypothesis may not hold when, for instance, exogenous events occur. In this thesis, we are interested in sequential decision-making in non-stationary environments. We propose a new model named HS3MDP, which represents non-stationary problems whose dynamics evolve among a finite set of contexts. To solve these problems efficiently, we adapt the POMCP algorithm to HS3MDPs. To learn the dynamics of problems in this class, we present RLCD with SCD, a method that can be used without knowing the number of contexts a priori. We then explore the field of argumentation, where few works have considered sequential decision-making. We study two types of problems: stochastic debates (APS) and mediation problems with non-stationary agents (DMP). We present a model that formalizes APS and allows them to be transformed into MOMDPs in order to optimize the sequence of arguments of one agent in the debate. We then extend this model to DMPs to allow a mediator to strategically allocate speaking turns in a debate.
Advisors/Committee Members: Maudet, Nicolas (thesis director).
Subjects/Keywords: Intelligence artificielle; Décisions séquentielles; Modèles markoviens; Planification; Argumentation; Environnements non-Stationnaires; Artificial intelligence; Sequential decision-making under uncertainty; 004

Penn State University
26.
Virani, Nurali.
Learning Data-driven Models for Decision-making in Intelligent Physical Systems.
Degree: 2017, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/13813nnv105
▼ Intelligent physical systems use machine learning for a variety of tasks from health monitoring to control. As the dependence on autonomous decision-making agents increases, it is of importance to understand and quantify the uncertainty associated with the decisions from machine learning frameworks. In order to facilitate the interaction with human agents (e.g., maintenance engineers and medical doctors) as well as to enable robust control for safety (e.g., autonomous navigation and sensor network adaptation), density estimation enables quantification of uncertainty in the output of a learning framework. In statistical learning, density estimation is a core problem, where the objective is to identify the underlying distribution from which the data are being generated. In this work, density estimation is established as a practical tool for data-driven modeling. A new and simple technique for density estimation is developed using concepts from statistical learning and optimization theory. Along with detection, classification, estimation, and tracking, which are crucial in learning and control, these models can also quantify uncertainty in their outputs.
This dissertation uses density estimation for developing new methods to solve practical problems of learning and decision-making. A few restrictive assumptions have been eliminated from these problems, yet tractable and accurate methods have been developed in this research. Specifically, in the sequential classification problem, the naive Bayes' assumption of conditional independence between measurements, given state, is relaxed. A novel technique to learn a unified context from multi-modal sensor data is developed. This knowledge of context is used to achieve tractable and accurate multi-modal sensor fusion, which cannot be achieved using the naive Bayes' assumption. Additionally, the context-aware measurement models are also used for unifying state estimation and dynamic sensor selection problems in a stochastic control framework. In sequential hypothesis testing with streaming data, the assumption that the observation sequence is independent and identically distributed (IID) has been removed by developing sequential tests for Markov models of time-series data. Further, density estimation has been used to create Markov models from multidimensional time-series data by developing a unified formulation for alphabet-size selection and measurement-space partitioning. In sequential tracking, the assumption of additive Gaussian noise has been eliminated by learning nonparametric density estimation-based measurement models, which can capture all the uncertainties in a given set of data. These measurement models have been used for state estimation and tracking with particle filters. In a sequential measurement model learning setting, the labels provided by instructors are allowed to be incorrect as the assumption of the instructor being perfect has not been used. A recursive density estimation algorithm has been developed and analyzed to show that correct models can be…
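For context on the sequential hypothesis testing mentioned in this abstract, the classical baseline that assumes IID observations is Wald's sequential probability ratio test (SPRT). The sketch below shows that baseline for Bernoulli data; it is not the dissertation's test for Markov models, and the function name `sprt_bernoulli`, the hypothesised probabilities, and the error rates are assumptions made for illustration.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 versus H1: p = p1 on IID 0/1 observations.

    Returns (decision, n) where decision is "H0", "H1", or "undecided"
    and n is the number of observations consumed.
    """
    # Wald's approximate stopping thresholds on the log-likelihood ratio.
    upper = math.log((1.0 - beta) / alpha)   # cross it: accept H1
    lower = math.log(beta / (1.0 - alpha))   # cross it: accept H0
    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # IID assumption: each observation adds an independent term.
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n


if __name__ == "__main__":
    print(sprt_bernoulli([1, 1, 0, 1, 1, 1, 1], p0=0.3, p1=0.7))
```

The additive update in the loop is exactly where the IID assumption enters; for dependent data such as a Markov chain, each increment would have to be conditioned on the preceding observations.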
Advisors/Committee Members: Asok Ray, Dissertation Advisor/Co-Advisor, Asok Ray, Committee Chair/Co-Chair, Sean Brennan, Dissertation Advisor/Co-Advisor, Shashi Phoha, Committee Chair/Co-Chair, Minghui Zhu, Outside Member, Ji-Woong Lee, Special Member.
Subjects/Keywords: data-driven modeling; statistical learning; density estimation; context learning; context-aware decision-making; pattern classification; multi-modal sensor fusion; sequential hypothesis testing; sequential learning; dynamic sensor selection; intelligent systems
27.
Shirota Filho, Ricardo.
Processos de decisão Markovianos com probabilidades imprecisas e representações relacionais: algoritmos e fundamentos.
Degree: PhD, Engenharia de Controle e Automação Mecânica, 2012, University of São Paulo
URL: http://www.teses.usp.br/teses/disponiveis/3/3152/tde-13062013-160912/
▼ This work is devoted to the theoretical and algorithmic development of Markov decision processes with imprecise probabilities and relational representations. In the literature, this configuration has been important within artificial intelligence planning, where the use of relational representations allows compact descriptions and imprecise probabilities result in more general forms of uncertainty. There are three main contributions. First, we discuss the foundations of sequential decision making with imprecise probabilities, highlighting some problems that remain open. These results directly affect (but are not restricted to) the model of interest in this work, Markov decision processes with imprecise probabilities. Second, we propose three algorithms for Markov decision processes with imprecise probabilities based on mathematical programming (optimization). Third, we develop ideas proposed by Trevizan, Cozman, and de Barros (2008) on the use of variants of the Real-Time Dynamic Programming algorithm to solve probabilistic planning problems described by extended versions of the Probabilistic Planning Domain Definition Language (PPDDL).
Advisors/Committee Members: Cozman, Fabio Gagliardi.
Subjects/Keywords: Algorithm; Algoritmos; Foundations; Fundamentos; Imprecise probabilities; Markov decision process; Probabilidades imprecisas; Processo de decisão Markoviano; Relational representations; Representações relacionais; Sequential decision making; Tomada de decisão sequencial
28.
YE HAN.
ESSAYS ON PHYSICIAN DECISION MAKING AND HEALTHCARE PROVIDER EFFICIENCY.
Degree: 2020, National University of Singapore
URL: https://scholarbank.nus.edu.sg/handle/10635/165106
Subjects/Keywords: decision making; physician workload; treatment outcome; decision fatigue; race concordance; sequential effect

Penn State University
29.
Yao, Bing.
Physical-Statistical Modeling and Optimization of Complex Systems - Healthcare and Manufacturing Applications.
Degree: 2019, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/16768bzy111
▼ The rapid development of sensing and information technology facilitates the effective modeling, monitoring, and control of complex systems. Advanced sensing and imaging have brought a data-rich environment and provided unprecedented opportunities to investigate system dynamics and further optimize decision making for smart health and advanced manufacturing. However, sensing data are generally high-dimensional and have complex structures. Realizing the full potential of sensing data depends to a great extent on novel analytical methods and tools with effective information-processing capabilities.
The objective of this dissertation is to advance the knowledge on sensor-based system monitoring, modeling, and optimization by developing innovative physical-statistical methods for smart health and advanced manufacturing. This research will enable and assist in 1) handling high-dimensional spatiotemporal data; 2) extracting pertinent information about system dynamics; 3) optimizing decision making under uncertainty. My research accomplishments include:
• Energy-efficient mobile ECG sensing: In Chapter 2, an energy-efficient framework is proposed for mobile ECG sensing through the constrained Markov decision process, where the sensing policy is optimized by maximizing the detection accuracy of cardiac events under the constraint of an energy budget.
• Physical-statistical modeling of space-time complex systems: In Chapter 3, a physics-driven spatiotemporal regularization method is developed for high-dimensional predictive modeling. This model not only captures the physics-based interrelationship between time-varying explanatory and response variables that are distributed in space, but also addresses the spatial and temporal regularizations to improve the prediction performance.
• Spatiotemporal inverse ECG modeling: In Chapter 4, a robust inverse ECG model with spatiotemporal regularization is developed to reconstruct the heart-surface electric potentials from body-surface sensor measurements. Furthermore, a wavelet-clustering method is proposed to investigate the cardiac pathological behaviors from the reconstructed heart signals and characterize the location and extent of myocardial infarctions on the heart surface.
• Multifractal analysis for nonlinear pattern characterization: In Chapter 5, a multifractal approach is developed to quantify the nonlinear and nonhomogeneous patterns in image profiles for defect identification and characterization in additive manufacturing (AM).
• Sequential optimization and real-time control of additive manufacturing processes: In Chapter 6, a sequential decision-making framework through the Markov decision process is proposed to optimize the AM build quality layer-by-layer. This framework enables on-the-fly assessment of AM build quality and real-time defect mitigation.
Advisors/Committee Members: Hui Yang (Dissertation Advisor/Co-Advisor, Committee Chair/Co-Chair); Soundar Kumara (Committee Member); Eunhye Song (Committee Member); Xingyuan Fang (Outside Member); Soraya M Samii (Special Member).
Subjects/Keywords: Physical-statistical modeling and optimization; Spatiotemporal regularization; Sequential decision making; Energy-efficient mobile sensing; Multifractal analysis; Quality control; Additive Manufacturing; Inverse ECG modeling

University of Oxford
30.
Monteiro, Pedro Tiago dos Santos.
Experimental studies in simple choice behaviour.
Degree: PhD, 2013, University of Oxford
URL: http://ora.ox.ac.uk/objects/uuid:ae36e6ba-c4ff-4b5f-9f49-5c921707baa2 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.581223
▼ This thesis addresses decision mechanisms in foraging situations, using laboratory experiments with European starlings (Sturnus vulgaris). Building on previous work from the Behavioural Ecology Research Group, I chose the Sequential Choice Model (SCM; reviewed in Kacelnik et al., 2011 − Appendix 1) as a starting point, and tested its premises and predictions generalising it to different experimental protocols. Classical decision models do not relate choice preferences to behaviour towards isolated options, and assume that choices involve time-consuming evaluations of all alternatives. However, previous work found that starlings’ responses to isolated options predict preference in choices, and that response times to single-option encounters are not reliably longer than response times in choices. Since, in the wild, options are normally encountered sequentially, dealing with isolated options can be considered of greater biological, and possibly psychological, significance than simultaneous decisions. Following this rationale, the SCM postulates that when multiple simultaneous stimuli are met they are processed in parallel, each competing against the memory of background opportunities, rather than comparing present options to each other. At the time of launching this research, these ideas had only been applied to protocols involving just two deterministic alternatives and offering no chance to explore the influence of learning history (i.e., how animals learn to choose; see Chapter 4). To increase their relevance and offer more rigorous tests, I generalised them to situations with multiple (see Chapters 2, 4 and 5), and in some cases probabilistic alternatives (see Chapter 3), controlling the learning regime. I combined these extensions with tests of economic rationality (see Chapter 6), a concept that is presently facing sustained debates. 
Integrating the findings of all experimental chapters (see Chapter 7), my results support the notion that behaviour in single-option encounters is fundamental to understanding choice behaviour. The important issue of whether choices involve a decision time cost or the opposite, a shortening of response times, remains unresolved, as neither could be evidenced reliably.
Subjects/Keywords: 150.72; Life Sciences; Biology; Zoological sciences; Behaviour (zoology); Psychology; Cognition; Experimental psychology; Learning; Choice; Decision making; starlings; Sequential Choice Model; latencies; foraging; rationality