You searched for subject:(Policy gradient). Showing records 1 – 25 of 25 total matches.
University of Alberta
1. Dick, Travis B. Policy Gradient Reinforcement Learning Without Regret.
Degree: MS, Department of Computing Science, 2015, University of Alberta
URL: https://era.library.ualberta.ca/files/df65vb663
Subjects/Keywords: Policy Gradient; Baseline; Reinforcement Learning
APA (6th Edition):
Dick, T. B. (2015). Policy Gradient Reinforcement Learning Without Regret. (Masters Thesis). University of Alberta. Retrieved from https://era.library.ualberta.ca/files/df65vb663
Chicago Manual of Style (16th Edition):
Dick, Travis B. “Policy Gradient Reinforcement Learning Without Regret.” 2015. Masters Thesis, University of Alberta. Accessed January 24, 2021. https://era.library.ualberta.ca/files/df65vb663.
MLA Handbook (7th Edition):
Dick, Travis B. “Policy Gradient Reinforcement Learning Without Regret.” 2015. Web. 24 Jan 2021.
Vancouver:
Dick TB. Policy Gradient Reinforcement Learning Without Regret. [Internet] [Masters thesis]. University of Alberta; 2015. [cited 2021 Jan 24]. Available from: https://era.library.ualberta.ca/files/df65vb663.
Council of Science Editors:
Dick TB. Policy Gradient Reinforcement Learning Without Regret. [Masters Thesis]. University of Alberta; 2015. Available from: https://era.library.ualberta.ca/files/df65vb663
Arizona State University
2. Barron, Trevor Paul. Adaptive Curvature for Stochastic Optimization.
Degree: Computer Science, 2019, Arizona State University
URL: http://repository.asu.edu/items/53675
Subjects/Keywords: Statistics; Robotics; Natural gradient descent; Policy gradient methods; Truncated Newton methods
APA (6th Edition):
Barron, T. P. (2019). Adaptive Curvature for Stochastic Optimization. (Masters Thesis). Arizona State University. Retrieved from http://repository.asu.edu/items/53675
Chicago Manual of Style (16th Edition):
Barron, Trevor Paul. “Adaptive Curvature for Stochastic Optimization.” 2019. Masters Thesis, Arizona State University. Accessed January 24, 2021. http://repository.asu.edu/items/53675.
MLA Handbook (7th Edition):
Barron, Trevor Paul. “Adaptive Curvature for Stochastic Optimization.” 2019. Web. 24 Jan 2021.
Vancouver:
Barron TP. Adaptive Curvature for Stochastic Optimization. [Internet] [Masters thesis]. Arizona State University; 2019. [cited 2021 Jan 24]. Available from: http://repository.asu.edu/items/53675.
Council of Science Editors:
Barron TP. Adaptive Curvature for Stochastic Optimization. [Masters Thesis]. Arizona State University; 2019. Available from: http://repository.asu.edu/items/53675
University of Alberta
3. Hackman, Leah M. Faster Gradient-TD Algorithms.
Degree: MS, Department of Computing Science, 2012, University of Alberta
URL: https://era.library.ualberta.ca/files/6w924d09v
Subjects/Keywords: Artificial; Off-Policy; Gradient-TD; Intelligence; Reinforcement; Hybrid-GQ; Learning
APA (6th Edition):
Hackman, L. M. (2012). Faster Gradient-TD Algorithms. (Masters Thesis). University of Alberta. Retrieved from https://era.library.ualberta.ca/files/6w924d09v
Chicago Manual of Style (16th Edition):
Hackman, Leah M. “Faster Gradient-TD Algorithms.” 2012. Masters Thesis, University of Alberta. Accessed January 24, 2021. https://era.library.ualberta.ca/files/6w924d09v.
MLA Handbook (7th Edition):
Hackman, Leah M. “Faster Gradient-TD Algorithms.” 2012. Web. 24 Jan 2021.
Vancouver:
Hackman LM. Faster Gradient-TD Algorithms. [Internet] [Masters thesis]. University of Alberta; 2012. [cited 2021 Jan 24]. Available from: https://era.library.ualberta.ca/files/6w924d09v.
Council of Science Editors:
Hackman LM. Faster Gradient-TD Algorithms. [Masters Thesis]. University of Alberta; 2012. Available from: https://era.library.ualberta.ca/files/6w924d09v
University of Alberta
4. Das Gupta, Ujjwal. Adaptive Representation for Policy Gradient.
Degree: MS, Department of Computing Science, 2015, University of Alberta
URL: https://era.library.ualberta.ca/files/zk51vk289
Subjects/Keywords: Representation Learning; Decision Trees; Policy Gradient; Reinforcement Learning
APA (6th Edition):
Das Gupta, U. (2015). Adaptive Representation for Policy Gradient. (Masters Thesis). University of Alberta. Retrieved from https://era.library.ualberta.ca/files/zk51vk289
Chicago Manual of Style (16th Edition):
Das Gupta, Ujjwal. “Adaptive Representation for Policy Gradient.” 2015. Masters Thesis, University of Alberta. Accessed January 24, 2021. https://era.library.ualberta.ca/files/zk51vk289.
MLA Handbook (7th Edition):
Das Gupta, Ujjwal. “Adaptive Representation for Policy Gradient.” 2015. Web. 24 Jan 2021.
Vancouver:
Das Gupta U. Adaptive Representation for Policy Gradient. [Internet] [Masters thesis]. University of Alberta; 2015. [cited 2021 Jan 24]. Available from: https://era.library.ualberta.ca/files/zk51vk289.
Council of Science Editors:
Das Gupta U. Adaptive Representation for Policy Gradient. [Masters Thesis]. University of Alberta; 2015. Available from: https://era.library.ualberta.ca/files/zk51vk289
Delft University of Technology
5. Goedhart, Menno (author). Intelligent Flapping Wing Control: Reinforcement Learning for the DelFly.
Degree: 2017, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:0f103559-d985-47b7-b145-fc814527f307
Subjects/Keywords: Reinforcement Learning; DelFly; Flapping Wing; Classification; Machine Learning; Policy Gradient
APA (6th Edition):
Goedhart, M. (2017). Intelligent Flapping Wing Control: Reinforcement Learning for the DelFly. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:0f103559-d985-47b7-b145-fc814527f307
Chicago Manual of Style (16th Edition):
Goedhart, Menno (author). “Intelligent Flapping Wing Control: Reinforcement Learning for the DelFly.” 2017. Masters Thesis, Delft University of Technology. Accessed January 24, 2021. http://resolver.tudelft.nl/uuid:0f103559-d985-47b7-b145-fc814527f307.
MLA Handbook (7th Edition):
Goedhart, Menno (author). “Intelligent Flapping Wing Control: Reinforcement Learning for the DelFly.” 2017. Web. 24 Jan 2021.
Vancouver:
Goedhart M. Intelligent Flapping Wing Control: Reinforcement Learning for the DelFly. [Internet] [Masters thesis]. Delft University of Technology; 2017. [cited 2021 Jan 24]. Available from: http://resolver.tudelft.nl/uuid:0f103559-d985-47b7-b145-fc814527f307.
Council of Science Editors:
Goedhart M. Intelligent Flapping Wing Control: Reinforcement Learning for the DelFly. [Masters Thesis]. Delft University of Technology; 2017. Available from: http://resolver.tudelft.nl/uuid:0f103559-d985-47b7-b145-fc814527f307
6. Poulin, Nolan. Proactive Planning through Active Policy Inference in Stochastic Environments.
Degree: MS, 2018, Worcester Polytechnic Institute
URL: etd-052818-100711 ; https://digitalcommons.wpi.edu/etd-theses/1267
Subjects/Keywords: active learning markov decision process softmax boltzmann policy gradient
APA (6th Edition):
Poulin, N. (2018). Proactive Planning through Active Policy Inference in Stochastic Environments. (Thesis). Worcester Polytechnic Institute. Retrieved from etd-052818-100711 ; https://digitalcommons.wpi.edu/etd-theses/1267
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Poulin, Nolan. “Proactive Planning through Active Policy Inference in Stochastic Environments.” 2018. Thesis, Worcester Polytechnic Institute. Accessed January 24, 2021. etd-052818-100711 ; https://digitalcommons.wpi.edu/etd-theses/1267.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Poulin, Nolan. “Proactive Planning through Active Policy Inference in Stochastic Environments.” 2018. Web. 24 Jan 2021.
Vancouver:
Poulin N. Proactive Planning through Active Policy Inference in Stochastic Environments. [Internet] [Thesis]. Worcester Polytechnic Institute; 2018. [cited 2021 Jan 24]. Available from: etd-052818-100711 ; https://digitalcommons.wpi.edu/etd-theses/1267.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Poulin N. Proactive Planning through Active Policy Inference in Stochastic Environments. [Thesis]. Worcester Polytechnic Institute; 2018. Available from: etd-052818-100711 ; https://digitalcommons.wpi.edu/etd-theses/1267
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
7. Könönen, Ville. Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches.
Degree: 2004, Helsinki University of Technology
URL: http://lib.tkk.fi/Diss/2004/isbn9512273594/
Subjects/Keywords: Markov games; reinforcement learning; Nash equilibrium; Stackelberg equilibrium; value function approximation; policy gradient
APA (6th Edition):
Könönen, V. (2004). Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches. (Thesis). Helsinki University of Technology. Retrieved from http://lib.tkk.fi/Diss/2004/isbn9512273594/
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Könönen, Ville. “Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches.” 2004. Thesis, Helsinki University of Technology. Accessed January 24, 2021. http://lib.tkk.fi/Diss/2004/isbn9512273594/.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Könönen, Ville. “Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches.” 2004. Web. 24 Jan 2021.
Vancouver:
Könönen V. Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches. [Internet] [Thesis]. Helsinki University of Technology; 2004. [cited 2021 Jan 24]. Available from: http://lib.tkk.fi/Diss/2004/isbn9512273594/.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Könönen V. Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches. [Thesis]. Helsinki University of Technology; 2004. Available from: http://lib.tkk.fi/Diss/2004/isbn9512273594/
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
University of Alberta
8. Maei, Hamid Reza. Gradient Temporal-Difference Learning Algorithms.
Degree: PhD, Department of Computing Science, 2011, University of Alberta
URL: https://era.library.ualberta.ca/files/8s45q967t
Subjects/Keywords: Policy Evaluation; Temporal-Difference learning; Reinforcement Learning; Stochastic Gradient-Descent; Value Function Approximation
APA (6th Edition):
Maei, H. R. (2011). Gradient Temporal-Difference Learning Algorithms. (Doctoral Dissertation). University of Alberta. Retrieved from https://era.library.ualberta.ca/files/8s45q967t
Chicago Manual of Style (16th Edition):
Maei, Hamid Reza. “Gradient Temporal-Difference Learning Algorithms.” 2011. Doctoral Dissertation, University of Alberta. Accessed January 24, 2021. https://era.library.ualberta.ca/files/8s45q967t.
MLA Handbook (7th Edition):
Maei, Hamid Reza. “Gradient Temporal-Difference Learning Algorithms.” 2011. Web. 24 Jan 2021.
Vancouver:
Maei HR. Gradient Temporal-Difference Learning Algorithms. [Internet] [Doctoral dissertation]. University of Alberta; 2011. [cited 2021 Jan 24]. Available from: https://era.library.ualberta.ca/files/8s45q967t.
Council of Science Editors:
Maei HR. Gradient Temporal-Difference Learning Algorithms. [Doctoral Dissertation]. University of Alberta; 2011. Available from: https://era.library.ualberta.ca/files/8s45q967t
University of Washington
9. Deole, Aditya. Model Free Optimal Control Approach for UAVs.
Degree: 2020, University of Washington
URL: http://hdl.handle.net/1773/46114
Subjects/Keywords: LQR; Model-free; Policy gradient; Q-learning; UAV; Mechanical engineering
APA (6th Edition):
Deole, A. (2020). Model Free Optimal Control Approach for UAVs. (Thesis). University of Washington. Retrieved from http://hdl.handle.net/1773/46114
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Deole, Aditya. “Model Free Optimal Control Approach for UAVs.” 2020. Thesis, University of Washington. Accessed January 24, 2021. http://hdl.handle.net/1773/46114.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Deole, Aditya. “Model Free Optimal Control Approach for UAVs.” 2020. Web. 24 Jan 2021.
Vancouver:
Deole A. Model Free Optimal Control Approach for UAVs. [Internet] [Thesis]. University of Washington; 2020. [cited 2021 Jan 24]. Available from: http://hdl.handle.net/1773/46114.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Deole A. Model Free Optimal Control Approach for UAVs. [Thesis]. University of Washington; 2020. Available from: http://hdl.handle.net/1773/46114
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
University of North Texas
10. Cox, Carissa. Spatial Patterns in Development Regulation: Tree Preservation Ordinances of the DFW Metropolitan Area.
Degree: 2011, University of North Texas
URL: https://digital.library.unt.edu/ark:/67531/metadc84194/
Subjects/Keywords: Tree preservation; urban rural gradient; hot spot rendering; preservation policy; DFW; ordinances; internal evaluation
APA (6th Edition):
Cox, C. (2011). Spatial Patterns in Development Regulation: Tree Preservation Ordinances of the DFW Metropolitan Area. (Thesis). University of North Texas. Retrieved from https://digital.library.unt.edu/ark:/67531/metadc84194/
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Cox, Carissa. “Spatial Patterns in Development Regulation: Tree Preservation Ordinances of the DFW Metropolitan Area.” 2011. Thesis, University of North Texas. Accessed January 24, 2021. https://digital.library.unt.edu/ark:/67531/metadc84194/.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Cox, Carissa. “Spatial Patterns in Development Regulation: Tree Preservation Ordinances of the DFW Metropolitan Area.” 2011. Web. 24 Jan 2021.
Vancouver:
Cox C. Spatial Patterns in Development Regulation: Tree Preservation Ordinances of the DFW Metropolitan Area. [Internet] [Thesis]. University of North Texas; 2011. [cited 2021 Jan 24]. Available from: https://digital.library.unt.edu/ark:/67531/metadc84194/.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Cox C. Spatial Patterns in Development Regulation: Tree Preservation Ordinances of the DFW Metropolitan Area. [Thesis]. University of North Texas; 2011. Available from: https://digital.library.unt.edu/ark:/67531/metadc84194/
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
KTH
11. Olafsson, Björgvin. Partially Observable Markov Decision Processes for Faster Object Recognition.
Degree: Computer Science and Communication (CSC), 2016, KTH
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198632
Subjects/Keywords: pomdp; policy gradient; optimal control; object detection; computer vision; information rewards; fixation policy; observation model; Computer Sciences; Datavetenskap (datalogi)
APA (6th Edition):
Olafsson, B. (2016). Partially Observable Markov Decision Processes for Faster Object Recognition. (Thesis). KTH. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198632
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Olafsson, Björgvin. “Partially Observable Markov Decision Processes for Faster Object Recognition.” 2016. Thesis, KTH. Accessed January 24, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198632.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Olafsson, Björgvin. “Partially Observable Markov Decision Processes for Faster Object Recognition.” 2016. Web. 24 Jan 2021.
Vancouver:
Olafsson B. Partially Observable Markov Decision Processes for Faster Object Recognition. [Internet] [Thesis]. KTH; 2016. [cited 2021 Jan 24]. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198632.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Olafsson B. Partially Observable Markov Decision Processes for Faster Object Recognition. [Thesis]. KTH; 2016. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198632
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
University of Alberta
12. Bastani, Meysam. Model-Free Intelligent Diabetes Management Using Machine Learning.
Degree: MS, Department of Computing Science, 2013, University of Alberta
URL: https://era.library.ualberta.ca/files/c12579s341
Subjects/Keywords: reinforcement learning; policy gradient; diabetes; insulin dosage adjustment; supervised learning; machine learning; type-1 diabetes; actor-critic
APA (6th Edition):
Bastani, M. (2013). Model-Free Intelligent Diabetes Management Using Machine Learning. (Masters Thesis). University of Alberta. Retrieved from https://era.library.ualberta.ca/files/c12579s341
Chicago Manual of Style (16th Edition):
Bastani, Meysam. “Model-Free Intelligent Diabetes Management Using Machine Learning.” 2013. Masters Thesis, University of Alberta. Accessed January 24, 2021. https://era.library.ualberta.ca/files/c12579s341.
MLA Handbook (7th Edition):
Bastani, Meysam. “Model-Free Intelligent Diabetes Management Using Machine Learning.” 2013. Web. 24 Jan 2021.
Vancouver:
Bastani M. Model-Free Intelligent Diabetes Management Using Machine Learning. [Internet] [Masters thesis]. University of Alberta; 2013. [cited 2021 Jan 24]. Available from: https://era.library.ualberta.ca/files/c12579s341.
Council of Science Editors:
Bastani M. Model-Free Intelligent Diabetes Management Using Machine Learning. [Masters Thesis]. University of Alberta; 2013. Available from: https://era.library.ualberta.ca/files/c12579s341
Halmstad University
13. Olsson, Anton. Domain Transfer for End-to-end Reinforcement Learning.
Degree: Information Technology, 2020, Halmstad University
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43042
Subjects/Keywords: Reinforcement Learning; Domain Transfer; Deep Deterministic Policy Gradient; Reinforcement Learning in Real-time; Computer Sciences; Datavetenskap (datalogi); Computer Engineering; Datorteknik
APA (6th Edition):
Olsson, A. (2020). Domain Transfer for End-to-end Reinforcement Learning. (Thesis). Halmstad University. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43042
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Olsson, Anton. “Domain Transfer for End-to-end Reinforcement Learning.” 2020. Thesis, Halmstad University. Accessed January 24, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43042.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Olsson, Anton. “Domain Transfer for End-to-end Reinforcement Learning.” 2020. Web. 24 Jan 2021.
Vancouver:
Olsson A. Domain Transfer for End-to-end Reinforcement Learning. [Internet] [Thesis]. Halmstad University; 2020. [cited 2021 Jan 24]. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43042.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Olsson A. Domain Transfer for End-to-end Reinforcement Learning. [Thesis]. Halmstad University; 2020. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43042
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
University of Waterloo
14. Pereira, Sahil. Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments.
Degree: 2020, University of Waterloo
URL: http://hdl.handle.net/10012/15851
Subjects/Keywords: reinforcement learning; multi-agent; stackelberg model; hierarchical environments; game theory; machine learning; continuous space; policy gradient; markov games; actor critic
APA (6th Edition):
Pereira, S. (2020). Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments. (Thesis). University of Waterloo. Retrieved from http://hdl.handle.net/10012/15851
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Pereira, Sahil. “Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments.” 2020. Thesis, University of Waterloo. Accessed January 24, 2021. http://hdl.handle.net/10012/15851.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Pereira, Sahil. “Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments.” 2020. Web. 24 Jan 2021.
Vancouver:
Pereira S. Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments. [Internet] [Thesis]. University of Waterloo; 2020. [cited 2021 Jan 24]. Available from: http://hdl.handle.net/10012/15851.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Pereira S. Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments. [Thesis]. University of Waterloo; 2020. Available from: http://hdl.handle.net/10012/15851
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Australian National University
15. Aberdeen, Douglas. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes .
Degree: 2003, Australian National University
URL: http://hdl.handle.net/1885/48180
Subjects/Keywords: POMDP; Reinforcement Learning; Policy gradient; cluster; high performance computing
APA (6th Edition):
Aberdeen, D. (2003). Policy-Gradient Algorithms for Partially Observable Markov Decision Processes . (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/48180
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Aberdeen, Douglas. “Policy-Gradient Algorithms for Partially Observable Markov Decision Processes .” 2003. Thesis, Australian National University. Accessed January 24, 2021. http://hdl.handle.net/1885/48180.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Aberdeen, Douglas. “Policy-Gradient Algorithms for Partially Observable Markov Decision Processes .” 2003. Web. 24 Jan 2021.
Vancouver:
Aberdeen D. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes . [Internet] [Thesis]. Australian National University; 2003. [cited 2021 Jan 24]. Available from: http://hdl.handle.net/1885/48180.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Aberdeen D. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes . [Thesis]. Australian National University; 2003. Available from: http://hdl.handle.net/1885/48180
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Australian National University
16. Greensmith, Evan. Policy Gradient Methods: Variance Reduction and Stochastic Convergence .
Degree: 2005, Australian National University
URL: http://hdl.handle.net/1885/47105
Subjects/Keywords: reinforcement learning; policy gradient; stochastic convergence; variance reduction
APA (6th Edition):
Greensmith, E. (2005). Policy Gradient Methods: Variance Reduction and Stochastic Convergence . (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/47105
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Greensmith, Evan. “Policy Gradient Methods: Variance Reduction and Stochastic Convergence .” 2005. Thesis, Australian National University. Accessed January 24, 2021. http://hdl.handle.net/1885/47105.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Greensmith, Evan. “Policy Gradient Methods: Variance Reduction and Stochastic Convergence .” 2005. Web. 24 Jan 2021.
Vancouver:
Greensmith E. Policy Gradient Methods: Variance Reduction and Stochastic Convergence . [Internet] [Thesis]. Australian National University; 2005. [cited 2021 Jan 24]. Available from: http://hdl.handle.net/1885/47105.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Greensmith E. Policy Gradient Methods: Variance Reduction and Stochastic Convergence . [Thesis]. Australian National University; 2005. Available from: http://hdl.handle.net/1885/47105
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
University of South Florida
17. Michaud, Brianna. A Habitat Analysis of Estuarine Fishes and Invertebrates, with Observations on the Effects of Habitat-Factor Resolution.
Degree: 2016, University of South Florida
URL: https://scholarcommons.usf.edu/etd/6543
Subjects/Keywords: multivariate analysis; community; estuarine management; estuarine gradient; Natural Resources Management and Policy; Other Oceanography and Atmospheric Sciences and Meteorology
APA (6th Edition):
Michaud, B. (2016). A Habitat Analysis of Estuarine Fishes and Invertebrates, with Observations on the Effects of Habitat-Factor Resolution. (Thesis). University of South Florida. Retrieved from https://scholarcommons.usf.edu/etd/6543
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Michaud, Brianna. “A Habitat Analysis of Estuarine Fishes and Invertebrates, with Observations on the Effects of Habitat-Factor Resolution.” 2016. Thesis, University of South Florida. Accessed January 24, 2021. https://scholarcommons.usf.edu/etd/6543.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Michaud, Brianna. “A Habitat Analysis of Estuarine Fishes and Invertebrates, with Observations on the Effects of Habitat-Factor Resolution.” 2016. Web. 24 Jan 2021.
Vancouver:
Michaud B. A Habitat Analysis of Estuarine Fishes and Invertebrates, with Observations on the Effects of Habitat-Factor Resolution. [Internet] [Thesis]. University of South Florida; 2016. [cited 2021 Jan 24]. Available from: https://scholarcommons.usf.edu/etd/6543.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Michaud B. A Habitat Analysis of Estuarine Fishes and Invertebrates, with Observations on the Effects of Habitat-Factor Resolution. [Thesis]. University of South Florida; 2016. Available from: https://scholarcommons.usf.edu/etd/6543
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
18. Aklil, Nassim. Apprentissage actif sous contrainte de budget en robotique et en neurosciences computationnelles. Localisation robotique et modélisation comportementale en environnement non stationnaire : Active learning under budget constraint in robotics and computational neuroscience. Robotic localization and behavioral modeling in non-stationary environment.
Degree: Docteur es, Intelligence Artificielle et Robotique, 2017, Université Pierre et Marie Curie – Paris VI
URL: http://www.theses.fr/2017PA066225
Subjects/Keywords: Apprentissage par renforcement; Apprentissage budgétisé; Apprentissage profond; Neurosciences computationnelles; Compromis exploration/exploitation; Policy gradient; Budgeted learning; Computational neuroscience; Deep learning; 629.89
APA (6th Edition):
Aklil, N. (2017). Apprentissage actif sous contrainte de budget en robotique et en neurosciences computationnelles. Localisation robotique et modélisation comportementale en environnement non stationnaire : Active learning under budget constraint in robotics and computational neuroscience. Robotic localization and behavioral modeling in non-stationary environment. (Doctoral Dissertation). Université Pierre et Marie Curie – Paris VI. Retrieved from http://www.theses.fr/2017PA066225
Chicago Manual of Style (16th Edition):
Aklil, Nassim. “Apprentissage actif sous contrainte de budget en robotique et en neurosciences computationnelles. Localisation robotique et modélisation comportementale en environnement non stationnaire : Active learning under budget constraint in robotics and computational neuroscience. Robotic localization and behavioral modeling in non-stationary environment.” 2017. Doctoral Dissertation, Université Pierre et Marie Curie – Paris VI. Accessed January 24, 2021. http://www.theses.fr/2017PA066225.
MLA Handbook (7th Edition):
Aklil, Nassim. “Apprentissage actif sous contrainte de budget en robotique et en neurosciences computationnelles. Localisation robotique et modélisation comportementale en environnement non stationnaire : Active learning under budget constraint in robotics and computational neuroscience. Robotic localization and behavioral modeling in non-stationary environment.” 2017. Web. 24 Jan 2021.
Vancouver:
Aklil N. Apprentissage actif sous contrainte de budget en robotique et en neurosciences computationnelles. Localisation robotique et modélisation comportementale en environnement non stationnaire : Active learning under budget constraint in robotics and computational neuroscience. Robotic localization and behavioral modeling in non-stationary environment. [Internet] [Doctoral dissertation]. Université Pierre et Marie Curie – Paris VI; 2017. [cited 2021 Jan 24]. Available from: http://www.theses.fr/2017PA066225.
Council of Science Editors:
Aklil N. Apprentissage actif sous contrainte de budget en robotique et en neurosciences computationnelles. Localisation robotique et modélisation comportementale en environnement non stationnaire : Active learning under budget constraint in robotics and computational neuroscience. Robotic localization and behavioral modeling in non-stationary environment. [Doctoral Dissertation]. Université Pierre et Marie Curie – Paris VI; 2017. Available from: http://www.theses.fr/2017PA066225
Cal Poly
19. McDowell, Journey. Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks.
Degree: MS, Mechanical Engineering, 2019, Cal Poly
URL: https://digitalcommons.calpoly.edu/theses/2100 ; 10.15368/theses.2019.117
Subjects/Keywords: Linear Quadratic Regulator; Deep Deterministic Policy Gradient; LQR; DDPG; Autonomous Vehicle; Machine Learning; Navigation, Guidance, Control, and Dynamics
APA (6th Edition):
McDowell, J. (2019). Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks. (Masters Thesis). Cal Poly. Retrieved from https://digitalcommons.calpoly.edu/theses/2100 ; 10.15368/theses.2019.117
Chicago Manual of Style (16th Edition):
McDowell, Journey. “Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks.” 2019. Masters Thesis, Cal Poly. Accessed January 24, 2021. https://digitalcommons.calpoly.edu/theses/2100 ; 10.15368/theses.2019.117.
MLA Handbook (7th Edition):
McDowell, Journey. “Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks.” 2019. Web. 24 Jan 2021.
Vancouver:
McDowell J. Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks. [Internet] [Masters thesis]. Cal Poly; 2019. [cited 2021 Jan 24]. Available from: https://digitalcommons.calpoly.edu/theses/2100 ; 10.15368/theses.2019.117.
Council of Science Editors:
McDowell J. Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks. [Masters Thesis]. Cal Poly; 2019. Available from: https://digitalcommons.calpoly.edu/theses/2100 ; 10.15368/theses.2019.117
20. Bhojraj, Gokul Kaisaravalli. Policy-based Reinforcement learning control for window opening and closing in an office building.
Degree: Microdata Analysis, 2020, Dalarna University
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:du-34420
Subjects/Keywords: Markov decision processes; Policy-based Reinforcement learning; Value-based Reinforcement learning; Q-learning; REINFORCE; policy gradient; window control; indoor comfort level; Social Sciences; Samhällsvetenskap
APA (6th Edition):
Bhojraj, G. K. (2020). Policy-based Reinforcement learning control for window opening and closing in an office building. (Thesis). Dalarna University. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:du-34420
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Bhojraj, Gokul Kaisaravalli. “Policy-based Reinforcement learning control for window opening and closing in an office building.” 2020. Thesis, Dalarna University. Accessed January 24, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:du-34420.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Bhojraj, Gokul Kaisaravalli. “Policy-based Reinforcement learning control for window opening and closing in an office building.” 2020. Web. 24 Jan 2021.
Vancouver:
Bhojraj GK. Policy-based Reinforcement learning control for window opening and closing in an office building. [Internet] [Thesis]. Dalarna University; 2020. [cited 2021 Jan 24]. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:du-34420.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Bhojraj GK. Policy-based Reinforcement learning control for window opening and closing in an office building. [Thesis]. Dalarna University; 2020. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:du-34420
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
21. -2178-1988. Program analysis techniques for algorithmic complexity and relational properties.
Degree: PhD, Computer Science, 2019, University of Texas – Austin
URL: http://dx.doi.org/10.26153/tsw/2181
Subjects/Keywords: Complexity testing; Optimal program synthesis; Fuzzing; Genetic programming; Performance bug; Vulnerability detection; Side channel; Static analysis; Relational verification; Reinforcement learning; Policy gradient
APA (6th Edition):
-2178-1988. (2019). Program analysis techniques for algorithmic complexity and relational properties. (Doctoral Dissertation). University of Texas – Austin. Retrieved from http://dx.doi.org/10.26153/tsw/2181
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Chicago Manual of Style (16th Edition):
-2178-1988. “Program analysis techniques for algorithmic complexity and relational properties.” 2019. Doctoral Dissertation, University of Texas – Austin. Accessed January 24, 2021. http://dx.doi.org/10.26153/tsw/2181.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
MLA Handbook (7th Edition):
-2178-1988. “Program analysis techniques for algorithmic complexity and relational properties.” 2019. Web. 24 Jan 2021.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Vancouver:
-2178-1988. Program analysis techniques for algorithmic complexity and relational properties. [Internet] [Doctoral dissertation]. University of Texas – Austin; 2019. [cited 2021 Jan 24]. Available from: http://dx.doi.org/10.26153/tsw/2181.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Council of Science Editors:
-2178-1988. Program analysis techniques for algorithmic complexity and relational properties. [Doctoral Dissertation]. University of Texas – Austin; 2019. Available from: http://dx.doi.org/10.26153/tsw/2181
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
22. 小倉, 裕平. 運動者に対するランニング経路推薦のための方策勾配法に基づくランニング経路生成方法の研究 [A study of a running-route generation method based on the policy gradient method for running-route recommendation to exercisers].
Degree: Japan Advanced Institute of Science and Technology / 北陸先端科学技術大学院大学
URL: http://hdl.handle.net/10119/15138
Supervisor: Ho Bao Tu
先端科学技術研究科 (Graduate School of Advanced Science and Technology)
修士(知識科学) (Master's degree, Knowledge Science)
Subjects/Keywords: 強化学習; Reinforcement Learning; ランニング経路推薦; Running Route Recommendation; 方策勾配法; Policy Gradient
APA (6th Edition):
小倉, . (n.d.). 運動者に対するランニング経路推薦のための方策勾配法に基づくランニング経路生成方法の研究. (Thesis). Japan Advanced Institute of Science and Technology / 北陸先端科学技術大学院大学. Retrieved from http://hdl.handle.net/10119/15138
Note: this citation may be lacking information needed for this citation format:
No year of publication.
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
小倉, 裕平. “運動者に対するランニング経路推薦のための方策勾配法に基づくランニング経路生成方法の研究.” Thesis, Japan Advanced Institute of Science and Technology / 北陸先端科学技術大学院大学. Accessed January 24, 2021. http://hdl.handle.net/10119/15138.
Note: this citation may be lacking information needed for this citation format:
No year of publication.
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
小倉, 裕平. “運動者に対するランニング経路推薦のための方策勾配法に基づくランニング経路生成方法の研究.” Web. 24 Jan 2021.
Note: this citation may be lacking information needed for this citation format:
No year of publication.
Vancouver:
小倉 . 運動者に対するランニング経路推薦のための方策勾配法に基づくランニング経路生成方法の研究. [Internet] [Thesis]. Japan Advanced Institute of Science and Technology / 北陸先端科学技術大学院大学; [cited 2021 Jan 24]. Available from: http://hdl.handle.net/10119/15138.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
No year of publication.
Council of Science Editors:
小倉 . 運動者に対するランニング経路推薦のための方策勾配法に基づくランニング経路生成方法の研究. [Thesis]. Japan Advanced Institute of Science and Technology / 北陸先端科学技術大学院大学; Available from: http://hdl.handle.net/10119/15138
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
No year of publication.
23. Van der Spek, I.T. (author). Imitation learning for a robotic precision placement task.
Degree: 2014, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:86815e55-bbba-45b4-915b-6f321b485940
Subjects/Keywords: reinforcement learning; imitation learning; policy gradient; pgpe; dynamic movement primitives; precision placement; dmp; optimization
Abstract excerpts (search matches): the thesis applies a policy gradient method, Policy Gradients with Parameter-based Exploration (PGPE), which had not previously been combined with imitation learning.
APA (6th Edition):
Van der Spek, I. T. (2014). Imitation learning for a robotic precision placement task. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:86815e55-bbba-45b4-915b-6f321b485940
Chicago Manual of Style (16th Edition):
Van der Spek, I T (author). “Imitation learning for a robotic precision placement task.” 2014. Masters Thesis, Delft University of Technology. Accessed January 24, 2021. http://resolver.tudelft.nl/uuid:86815e55-bbba-45b4-915b-6f321b485940.
MLA Handbook (7th Edition):
Van der Spek, I T (author). “Imitation learning for a robotic precision placement task.” 2014. Web. 24 Jan 2021.
Vancouver:
Van der Spek IT. Imitation learning for a robotic precision placement task. [Internet] [Masters thesis]. Delft University of Technology; 2014. [cited 2021 Jan 24]. Available from: http://resolver.tudelft.nl/uuid:86815e55-bbba-45b4-915b-6f321b485940.
Council of Science Editors:
Van der Spek IT. Imitation learning for a robotic precision placement task. [Masters Thesis]. Delft University of Technology; 2014. Available from: http://resolver.tudelft.nl/uuid:86815e55-bbba-45b4-915b-6f321b485940
24. Masood, Muhammad Arjumand. Algorithms for Discovering Collections of High-Quality and Diverse Solutions, With Applications to Bayesian Non-Negative Matrix Factorization and Reinforcement Learning.
Degree: PhD, 2019, Harvard University
URL: http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029756
Subjects/Keywords: machine learning; NMF; non-negative matrix factorization; reinforcement learning; policy gradient; Stein discrepancy
Abstract excerpts (search matches): the thesis augments state-of-the-art policy gradient methods with a diversity-inducing term, with applications to Bayesian non-negative matrix factorization and to on-policy and off-policy reinforcement learning.
APA (6th Edition):
Masood, M. A. (2019). Algorithms for Discovering Collections of High-Quality and Diverse Solutions, With Applications to Bayesian Non-Negative Matrix Factorization and Reinforcement Learning. (Doctoral Dissertation). Harvard University. Retrieved from http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029756
Chicago Manual of Style (16th Edition):
Masood, Muhammad Arjumand. “Algorithms for Discovering Collections of High-Quality and Diverse Solutions, With Applications to Bayesian Non-Negative Matrix Factorization and Reinforcement Learning.” 2019. Doctoral Dissertation, Harvard University. Accessed January 24, 2021. http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029756.
MLA Handbook (7th Edition):
Masood, Muhammad Arjumand. “Algorithms for Discovering Collections of High-Quality and Diverse Solutions, With Applications to Bayesian Non-Negative Matrix Factorization and Reinforcement Learning.” 2019. Web. 24 Jan 2021.
Vancouver:
Masood MA. Algorithms for Discovering Collections of High-Quality and Diverse Solutions, With Applications to Bayesian Non-Negative Matrix Factorization and Reinforcement Learning. [Internet] [Doctoral dissertation]. Harvard University; 2019. [cited 2021 Jan 24]. Available from: http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029756.
Council of Science Editors:
Masood MA. Algorithms for Discovering Collections of High-Quality and Diverse Solutions, With Applications to Bayesian Non-Negative Matrix Factorization and Reinforcement Learning. [Doctoral Dissertation]. Harvard University; 2019. Available from: http://nrs.harvard.edu/urn-3:HUL.InstRepos:42029756
25. Liu, Stewart. Combining Retrospective Optimization and Gradient Search for Supply Chain Optimization.
Degree: Industrial Engineering & Operations Research, 2017, University of California – Berkeley
URL: http://www.escholarship.org/uc/item/3n52p4zj
Subjects/Keywords: Operations research; Data-Driven Optimization; Gradient Search; Multi-Echelon Supply Chain; Retrospective Optimization; State-Dependent Inventory Policy; Supply Chain Optimization
Abstract excerpts (search matches): the thesis develops an approximate solution algorithm that combines a traditional MILP solver with gradient search.
APA (6th Edition):
Liu, S. (2017). Combining Retrospective Optimization and Gradient Search for Supply Chain Optimization. (Thesis). University of California – Berkeley. Retrieved from http://www.escholarship.org/uc/item/3n52p4zj
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Liu, Stewart. “Combining Retrospective Optimization and Gradient Search for Supply Chain Optimization.” 2017. Thesis, University of California – Berkeley. Accessed January 24, 2021. http://www.escholarship.org/uc/item/3n52p4zj.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Liu, Stewart. “Combining Retrospective Optimization and Gradient Search for Supply Chain Optimization.” 2017. Web. 24 Jan 2021.
Vancouver:
Liu S. Combining Retrospective Optimization and Gradient Search for Supply Chain Optimization. [Internet] [Thesis]. University of California – Berkeley; 2017. [cited 2021 Jan 24]. Available from: http://www.escholarship.org/uc/item/3n52p4zj.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Liu S. Combining Retrospective Optimization and Gradient Search for Supply Chain Optimization. [Thesis]. University of California – Berkeley; 2017. Available from: http://www.escholarship.org/uc/item/3n52p4zj
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation