Full Record

Author | Das, Indrajit |

Title | Inverse reinforcement learning of risk-sensitive utility |

URL | http://purl.galileo.usg.edu/uga_etd/das_indrajit_201608_ms |

Publication Date | 2016 |

Date Available | 2017-03-28 04:31:15 |

Date Accessioned | 2017-03-28 04:31:15 |

Degree | MS |

Discipline/Department | Computer Science |

Degree Level | masters |

University/Publisher | University of Georgia |

Abstract | The uncertain and stochastic nature of the real world poses a challenge for autonomous cars: they must make motion decisions that account for the safety of their passengers and of other cars, which may or may not be autonomous. It is crucial for these systems to learn the driving patterns of other vehicles from their environment in order to predict their movement and make better decisions. In this research, we focus on the highway merging problem, in which an autonomous vehicle attempts to merge onto a highway, and we address it using Inverse Reinforcement Learning. Human behavior is complex, and both linear and exponential utility functions fail to capture the non-linearity in such decision making. To resolve this issue, we model such behavior with a one-switch utility function. We present an Inverse Reinforcement Learning technique that models risk with a one-switch utility function, allowing an autonomous vehicle to predict human driving patterns and merge efficiently onto a highway. |
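The one-switch utility the abstract refers to belongs to Bell's linear-plus-exponential family: a linear term for near-risk-neutral behavior plus an exponential term that dominates at low wealth. A minimal sketch follows; the parameter values `a`, `b`, and `c` are illustrative assumptions, not values taken from the thesis.

```python
def one_switch_utility(x, a=1.0, b=0.5, c=0.8):
    """One-switch utility u(x) = a*x - b*c**x (linear plus exponential),
    with a, b > 0 and 0 < c < 1.  Parameter values are illustrative only.
    For small x the exponential term dominates (risk-sensitive region);
    as x grows it vanishes and behavior approaches risk neutrality."""
    return a * x - b * c ** x

u_low = one_switch_utility(0.0)    # exponential term still significant
u_high = one_switch_utility(10.0)  # nearly linear: 0.8**10 is tiny
```

The "one switch" name comes from the property that a decision maker with this utility changes their preference between two gambles at most once as wealth increases.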

Subjects/Keywords | Inverse Reinforcement Learning |

Contributors | Prashant Doshi |

Language | en |

Rights | public |

Country of Publication | us |

Record ID | oai:ugakr.libs.uga.edu:10724/36698 |

Repository | uga |

Date Retrieved | 2017-03-31 |

Date Indexed | 2017-03-31 |

Note | [degree] MS; [department] Computer Science; [major] Computer Science; [advisor] Prashant Doshi; [committee] Prashant Doshi; |

Sample Search Hits

…2 Background
2.1 Utility Theory
2.2 Markov Decision Process
2.3 *Inverse* *Reinforcement* *Learning*…

…function is a class of utility functions that can model such decision makers using cumulative rewards. We apply *Inverse* *Reinforcement* *Learning* (IRL) to the Markov Decision Process (MDP) framework to *learn* the reward functions that…

…We have developed an *Inverse* *Reinforcement* *Learning* (IRL) technique for *learning* a non-decomposable one-switch (1s) utility function.
3. We are validating the *Inverse* *Reinforcement* *Learning* algorithm with the help of a demo problem…

…document has been structured into nine chapters. Chapter 2 gives a brief review of
Markov Decision Processes, Utility Theory and *Inverse* *Reinforcement* *Learning*, which are
the building blocks of both the problem and the solution domain. Chapter 3 covers the…

…utility theory and one-switch utility functions, followed by the Markov Decision Process (MDP), a framework for sequential decision making in uncertain environments. Finally, we introduce *Inverse* *Reinforcement* *Learning* (IRL) and…
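The MDP framework mentioned in this snippet can be made concrete with a minimal value-iteration loop. The two-state merging model below, including its transition probabilities and rewards, is invented for illustration and is not drawn from the thesis.

```python
# Value iteration on a toy two-state merging MDP (hypothetical model).
GAMMA = 0.9
STATES = ["merge_lane", "highway"]
ACTIONS = ["wait", "merge"]
# T[s][a] -> list of (next_state, probability); R[s][a] -> reward
T = {
    "merge_lane": {"wait":  [("merge_lane", 1.0)],
                   "merge": [("highway", 0.8), ("merge_lane", 0.2)]},
    "highway":    {"wait":  [("highway", 1.0)],
                   "merge": [("highway", 1.0)]},
}
R = {"merge_lane": {"wait": -1.0, "merge": -0.5},
     "highway":    {"wait":  1.0, "merge":  1.0}}

# Repeated Bellman backups: V(s) = max_a [R(s,a) + gamma * sum_s' T(s,a,s') V(s')]
V = {s: 0.0 for s in STATES}
for _ in range(200):
    V = {s: max(R[s][a] + GAMMA * sum(p * V[s2] for s2, p in T[s][a])
                for a in ACTIONS)
         for s in STATES}
```

After convergence, `V["highway"]` approaches 1/(1-0.9) = 10, and the merge action dominates from the merge lane despite its small immediate cost.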

…[equation (2.5), a sum over s′ ∈ S, truncated]
2.3 *Inverse* *Reinforcement* *Learning*
In *Inverse* *Reinforcement* *Learning* (IRL), we primarily deal with two kinds of agents, namely the expert and the learner. The expert is assumed to be…

…generally as possible”. The resulting distribution matches the given constraints but is otherwise completely unbiased. In the case of *Inverse* *Reinforcement* *Learning*, we formulate this problem as finding the distribution over policies such that the feature…
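The maximum-entropy idea this snippet describes — pick the distribution that matches the expert's feature expectations but is otherwise unbiased — can be sketched over a finite set of trajectories. The feature vectors and the expert's empirical feature expectation below are hypothetical, and the trajectory set stands in for the distribution over policies discussed in the thesis.

```python
import math

# MaxEnt feature matching over three hypothetical trajectories:
# fit weights theta so that the expectation of f(tau) under
# P(tau) proportional to exp(theta . f(tau)) matches the expert's counts.
feats = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]   # f(tau) for each trajectory
expert = (0.6, 0.4)                            # expert's feature expectation
theta = [0.0, 0.0]

for _ in range(2000):
    # Unnormalized weights exp(theta . f) and partition function z
    w = [math.exp(sum(t * f for t, f in zip(theta, fv))) for fv in feats]
    z = sum(w)
    # Expected features under the current distribution
    expect = [sum(w[i] * feats[i][k] for i in range(len(feats))) / z
              for k in range(2)]
    # Gradient ascent on log-likelihood: empirical minus expected features
    theta = [theta[k] + 0.1 * (expert[k] - expect[k]) for k in range(2)]
```

At convergence `expect` matches `expert`, and the entropy of the fitted distribution is maximal among all distributions satisfying that constraint.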

…making is sub-rational. In this thesis, we direct our attention towards *learning* the risk-prone behavior of human drivers. To address this issue, we designed a car system, referred to as the ABC car system. Figure 1.1 is the diagrammatic…