
You searched for subject:(Checkpoint restart). Showing records 1 – 15 of 15 total matches.



Rice University

1. Vrvilo, Nick. Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model.

Degree: MS, Engineering, 2014, Rice University

It has been claimed that what simplifies parallelism can also simplify resilience. Based on that assertion, we present the Concurrent Collections programming model (CnC) as…

Subjects/Keywords: Concurrent Collections; Resilience; Checkpoint/Restart


APA (6th Edition):

Vrvilo, N. (2014). Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model. (Masters Thesis). Rice University. Retrieved from http://hdl.handle.net/1911/88191

Chicago Manual of Style (16th Edition):

Vrvilo, Nick. “Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model.” 2014. Masters Thesis, Rice University. Accessed April 14, 2021. http://hdl.handle.net/1911/88191.

MLA Handbook (7th Edition):

Vrvilo, Nick. “Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model.” 2014. Web. 14 Apr 2021.

Vancouver:

Vrvilo N. Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model. [Internet] [Masters thesis]. Rice University; 2014. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/1911/88191.

Council of Science Editors:

Vrvilo N. Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model. [Masters Thesis]. Rice University; 2014. Available from: http://hdl.handle.net/1911/88191


University of New Mexico

2. Ferreira, Kurt. Keeping checkpoint/restart viable for exascale systems.

Degree: Department of Computer Science, 2011, University of New Mexico

Next-generation exascale systems, those capable of performing a quintillion operations per second, are expected to be delivered in the next 8-10 years. These systems, which…

Subjects/Keywords: Checkpoint/Restart; Reliability; Exascale; State-Machine Replication


APA (6th Edition):

Ferreira, K. (2011). Keeping checkpoint/restart viable for exascale systems. (Doctoral Dissertation). University of New Mexico. Retrieved from http://hdl.handle.net/1928/17473

Chicago Manual of Style (16th Edition):

Ferreira, Kurt. “Keeping checkpoint/restart viable for exascale systems.” 2011. Doctoral Dissertation, University of New Mexico. Accessed April 14, 2021. http://hdl.handle.net/1928/17473.

MLA Handbook (7th Edition):

Ferreira, Kurt. “Keeping checkpoint/restart viable for exascale systems.” 2011. Web. 14 Apr 2021.

Vancouver:

Ferreira K. Keeping checkpoint/restart viable for exascale systems. [Internet] [Doctoral dissertation]. University of New Mexico; 2011. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/1928/17473.

Council of Science Editors:

Ferreira K. Keeping checkpoint/restart viable for exascale systems. [Doctoral Dissertation]. University of New Mexico; 2011. Available from: http://hdl.handle.net/1928/17473

3. Popov, Mihail. Décomposition automatique des programmes parallèles pour l'optimisation et la prédiction de performance. : Automatic decomposition of parallel programs for optimization and performance prediction.

Degree: Docteur es, Informatique, 2016, Université Paris-Saclay (ComUE)

In high-performance computing, many benchmark programs are used to measure the efficiency of computers, compilers, and optimizations of…

Subjects/Keywords: Performance prediction; Parallelism; Compilation; Optimization; Checkpoint restart; 004.35


APA (6th Edition):

Popov, M. (2016). Décomposition automatique des programmes parallèles pour l'optimisation et la prédiction de performance. : Automatic decomposition of parallel programs for optimization and performance prediction. (Doctoral Dissertation). Université Paris-Saclay (ComUE). Retrieved from http://www.theses.fr/2016SACLV087

Chicago Manual of Style (16th Edition):

Popov, Mihail. “Décomposition automatique des programmes parallèles pour l'optimisation et la prédiction de performance. : Automatic decomposition of parallel programs for optimization and performance prediction.” 2016. Doctoral Dissertation, Université Paris-Saclay (ComUE). Accessed April 14, 2021. http://www.theses.fr/2016SACLV087.

MLA Handbook (7th Edition):

Popov, Mihail. “Décomposition automatique des programmes parallèles pour l'optimisation et la prédiction de performance. : Automatic decomposition of parallel programs for optimization and performance prediction.” 2016. Web. 14 Apr 2021.

Vancouver:

Popov M. Décomposition automatique des programmes parallèles pour l'optimisation et la prédiction de performance. : Automatic decomposition of parallel programs for optimization and performance prediction. [Internet] [Doctoral dissertation]. Université Paris-Saclay (ComUE); 2016. [cited 2021 Apr 14]. Available from: http://www.theses.fr/2016SACLV087.

Council of Science Editors:

Popov M. Décomposition automatique des programmes parallèles pour l'optimisation et la prédiction de performance. : Automatic decomposition of parallel programs for optimization and performance prediction. [Doctoral Dissertation]. Université Paris-Saclay (ComUE); 2016. Available from: http://www.theses.fr/2016SACLV087


University of Toronto

4. Siniavine, Maxim. Seamless Kernel Updates.

Degree: 2012, University of Toronto

Kernel patches are frequently released to fix security vulnerabilities and bugs. However, users and system administrators often delay installing these updates because they require a…

Subjects/Keywords: Updates; Operating Systems; Linux; Security; checkpoint; restart; 0984


APA (6th Edition):

Siniavine, M. (2012). Seamless Kernel Updates. (Masters Thesis). University of Toronto. Retrieved from http://hdl.handle.net/1807/33532

Chicago Manual of Style (16th Edition):

Siniavine, Maxim. “Seamless Kernel Updates.” 2012. Masters Thesis, University of Toronto. Accessed April 14, 2021. http://hdl.handle.net/1807/33532.

MLA Handbook (7th Edition):

Siniavine, Maxim. “Seamless Kernel Updates.” 2012. Web. 14 Apr 2021.

Vancouver:

Siniavine M. Seamless Kernel Updates. [Internet] [Masters thesis]. University of Toronto; 2012. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/1807/33532.

Council of Science Editors:

Siniavine M. Seamless Kernel Updates. [Masters Thesis]. University of Toronto; 2012. Available from: http://hdl.handle.net/1807/33532


Northeastern University

5. Cao, Jiajun. Transparent checkpointing over RDMA-based networks.

Degree: PhD, Computer Science Program, 2017, Northeastern University

Fault tolerance for large-scale applications has long been an area of active research, as the size of the computation keeps growing. One of the components…

Subjects/Keywords: checkpoint-restart; cloud computing; MPI; RDMA; supercomputing; virtualization


APA (6th Edition):

Cao, J. (2017). Transparent checkpointing over RDMA-based networks. (Doctoral Dissertation). Northeastern University. Retrieved from http://hdl.handle.net/2047/D20290419

Chicago Manual of Style (16th Edition):

Cao, Jiajun. “Transparent checkpointing over RDMA-based networks.” 2017. Doctoral Dissertation, Northeastern University. Accessed April 14, 2021. http://hdl.handle.net/2047/D20290419.

MLA Handbook (7th Edition):

Cao, Jiajun. “Transparent checkpointing over RDMA-based networks.” 2017. Web. 14 Apr 2021.

Vancouver:

Cao J. Transparent checkpointing over RDMA-based networks. [Internet] [Doctoral dissertation]. Northeastern University; 2017. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/2047/D20290419.

Council of Science Editors:

Cao J. Transparent checkpointing over RDMA-based networks. [Doctoral Dissertation]. Northeastern University; 2017. Available from: http://hdl.handle.net/2047/D20290419


University of California – Irvine

6. POURGHASSEMI, BEHNAM. cudaCR: An In-kernel Application-level Checkpoint/Restart Scheme for CUDA Applications.

Degree: Electrical and Computer Engineering, 2017, University of California – Irvine

Fault-tolerance is becoming increasingly important as we enter the era of exascale computing. Increasing the number of cores results in a smaller mean time between…

Subjects/Keywords: Computer engineering; Computer science; checkpoint/restart; Fault tolerance; GPU; soft-errors; supercomputer


APA (6th Edition):

POURGHASSEMI, B. (2017). cudaCR: An In-kernel Application-level Checkpoint/Restart Scheme for CUDA Applications. (Thesis). University of California – Irvine. Retrieved from http://www.escholarship.org/uc/item/7nc05406

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

POURGHASSEMI, BEHNAM. “cudaCR: An In-kernel Application-level Checkpoint/Restart Scheme for CUDA Applications.” 2017. Thesis, University of California – Irvine. Accessed April 14, 2021. http://www.escholarship.org/uc/item/7nc05406.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

POURGHASSEMI, BEHNAM. “cudaCR: An In-kernel Application-level Checkpoint/Restart Scheme for CUDA Applications.” 2017. Web. 14 Apr 2021.

Vancouver:

POURGHASSEMI B. cudaCR: An In-kernel Application-level Checkpoint/Restart Scheme for CUDA Applications. [Internet] [Thesis]. University of California – Irvine; 2017. [cited 2021 Apr 14]. Available from: http://www.escholarship.org/uc/item/7nc05406.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

POURGHASSEMI B. cudaCR: An In-kernel Application-level Checkpoint/Restart Scheme for CUDA Applications. [Thesis]. University of California – Irvine; 2017. Available from: http://www.escholarship.org/uc/item/7nc05406

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation


North Carolina State University

7. Wang, Chao. Transparent Fault Tolerance for Job Healing in HPC Environments.

Degree: PhD, Computer Science, 2009, North Carolina State University

As the number of nodes in high-performance computing environments keeps increasing, faults are becoming commonplace, causing losses in intermediate results of HPC jobs. Furthermore,…

Subjects/Keywords: job input data; fault tolerance; high-performance computing; fault resilience; checkpoint/restart


APA (6th Edition):

Wang, C. (2009). Transparent Fault Tolerance for Job Healing in HPC Environments. (Doctoral Dissertation). North Carolina State University. Retrieved from http://www.lib.ncsu.edu/resolver/1840.16/4437

Chicago Manual of Style (16th Edition):

Wang, Chao. “Transparent Fault Tolerance for Job Healing in HPC Environments.” 2009. Doctoral Dissertation, North Carolina State University. Accessed April 14, 2021. http://www.lib.ncsu.edu/resolver/1840.16/4437.

MLA Handbook (7th Edition):

Wang, Chao. “Transparent Fault Tolerance for Job Healing in HPC Environments.” 2009. Web. 14 Apr 2021.

Vancouver:

Wang C. Transparent Fault Tolerance for Job Healing in HPC Environments. [Internet] [Doctoral dissertation]. North Carolina State University; 2009. [cited 2021 Apr 14]. Available from: http://www.lib.ncsu.edu/resolver/1840.16/4437.

Council of Science Editors:

Wang C. Transparent Fault Tolerance for Job Healing in HPC Environments. [Doctoral Dissertation]. North Carolina State University; 2009. Available from: http://www.lib.ncsu.edu/resolver/1840.16/4437

8. Tao, Dingwen. Fault Tolerance for Iterative Methods in High-Performance Computing.

Degree: Computer Science, 2018, University of California – Riverside

Iterative methods are commonly used approaches to solve large, sparse linear systems, which are fundamental operations for many modern scientific simulations. When the large-scale iterative…

Subjects/Keywords: Computer science; Checkpoint/Restart; Fault Tolerance; High Performance Computing; Iterative Methods; Performance; Resilience


APA (6th Edition):

Tao, D. (2018). Fault Tolerance for Iterative Methods in High-Performance Computing. (Thesis). University of California – Riverside. Retrieved from http://www.escholarship.org/uc/item/4fc474t2

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Tao, Dingwen. “Fault Tolerance for Iterative Methods in High-Performance Computing.” 2018. Thesis, University of California – Riverside. Accessed April 14, 2021. http://www.escholarship.org/uc/item/4fc474t2.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

Tao, Dingwen. “Fault Tolerance for Iterative Methods in High-Performance Computing.” 2018. Web. 14 Apr 2021.

Vancouver:

Tao D. Fault Tolerance for Iterative Methods in High-Performance Computing. [Internet] [Thesis]. University of California – Riverside; 2018. [cited 2021 Apr 14]. Available from: http://www.escholarship.org/uc/item/4fc474t2.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

Tao D. Fault Tolerance for Iterative Methods in High-Performance Computing. [Thesis]. University of California – Riverside; 2018. Available from: http://www.escholarship.org/uc/item/4fc474t2

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation


University of Sydney

9. Egwutuoha, Ifeanyi Paulinus. A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud.

Degree: 2013, University of Sydney

High Performance Computing (HPC) systems have been widely used by scientists and researchers in both industry and university laboratories to solve advanced computation problems. Most…

Subjects/Keywords: HPC systems in the cloud; Fault tolerance; Cloud computing; Checkpoint/restart; HaaS


APA (6th Edition):

Egwutuoha, I. P. (2013). A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud. (Thesis). University of Sydney. Retrieved from http://hdl.handle.net/2123/11484

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Egwutuoha, Ifeanyi Paulinus. “A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud.” 2013. Thesis, University of Sydney. Accessed April 14, 2021. http://hdl.handle.net/2123/11484.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

Egwutuoha, Ifeanyi Paulinus. “A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud.” 2013. Web. 14 Apr 2021.

Vancouver:

Egwutuoha IP. A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud. [Internet] [Thesis]. University of Sydney; 2013. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/2123/11484.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

Egwutuoha IP. A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud. [Thesis]. University of Sydney; 2013. Available from: http://hdl.handle.net/2123/11484

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation


Université de Grenoble

10. Bouguerra, Mohamed Slim. Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement : Fault tolerance in the parallel and distributed environments : optimizing the checkpoint restart strategy and scheduling.

Degree: Docteur es, Informatique, 2012, Université de Grenoble

Scaling up new parallel and distributed computing platforms raises many scientific challenges. Eventually, one can expect to see the emergence of…

Subjects/Keywords: Fault tolerance; Checkpoint restart; Multi-objective scheduling; Grid computing; Reliability; HPC


APA (6th Edition):

Bouguerra, M. S. (2012). Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement : Fault tolerance in the parallel and distributed environments : optimizing the checkpoint restart strategy and scheduling. (Doctoral Dissertation). Université de Grenoble. Retrieved from http://www.theses.fr/2012GRENM023

Chicago Manual of Style (16th Edition):

Bouguerra, Mohamed Slim. “Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement : Fault tolerance in the parallel and distributed environments : optimizing the checkpoint restart strategy and scheduling.” 2012. Doctoral Dissertation, Université de Grenoble. Accessed April 14, 2021. http://www.theses.fr/2012GRENM023.

MLA Handbook (7th Edition):

Bouguerra, Mohamed Slim. “Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement : Fault tolerance in the parallel and distributed environments : optimizing the checkpoint restart strategy and scheduling.” 2012. Web. 14 Apr 2021.

Vancouver:

Bouguerra MS. Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement : Fault tolerance in the parallel and distributed environments : optimizing the checkpoint restart strategy and scheduling. [Internet] [Doctoral dissertation]. Université de Grenoble; 2012. [cited 2021 Apr 14]. Available from: http://www.theses.fr/2012GRENM023.

Council of Science Editors:

Bouguerra MS. Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement : Fault tolerance in the parallel and distributed environments : optimizing the checkpoint restart strategy and scheduling. [Doctoral Dissertation]. Université de Grenoble; 2012. Available from: http://www.theses.fr/2012GRENM023

11. Hamouda, Sara S. Resilience in high-level parallel programming languages.

Degree: 2019, Australian National University

The consistent trends of increasing core counts and decreasing mean-time-to-failure in supercomputers make supporting task parallelism and resilience a necessity in HPC programming models. Given…

Subjects/Keywords: APGAS; Resilience; Fault Tolerance; X10; MPI-ULFM; Transactional Memory; Checkpoint-Restart; Async-Finish; Task-Based Runtime Systems; Termination Detection; Taxonomy of Resilient Programming Models


APA (6th Edition):

Hamouda, S. S. (2019). Resilience in high-level parallel programming languages. (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/164137

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Hamouda, Sara S. “Resilience in high-level parallel programming languages.” 2019. Thesis, Australian National University. Accessed April 14, 2021. http://hdl.handle.net/1885/164137.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

Hamouda, Sara S. “Resilience in high-level parallel programming languages.” 2019. Web. 14 Apr 2021.

Vancouver:

Hamouda SS. Resilience in high-level parallel programming languages. [Internet] [Thesis]. Australian National University; 2019. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/1885/164137.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

Hamouda SS. Resilience in high-level parallel programming languages. [Thesis]. Australian National University; 2019. Available from: http://hdl.handle.net/1885/164137

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation


Northeastern University

12. Arya, Kapil. User-space process virtualization in the context of checkpoint-restart and virtual machines.

Degree: PhD, Department of Computer Science, 2015, Northeastern University

Checkpoint-Restart is the ability to save a set of running processes to a checkpoint image on disk, and to later restart them from the disk.…

Subjects/Keywords: Checkpoint-restart; Distributed computing; Fault-tolerance; Paging; Virtualization; Virtual machines; Computer Sciences; Virtual computer systems; Programming; Data recovery (Computer science); Application software; Programming; Electronic data processing; Distributed processing


APA (6th Edition):

Arya, K. (2015). User-space process virtualization in the context of checkpoint-restart and virtual machines. (Doctoral Dissertation). Northeastern University. Retrieved from http://hdl.handle.net/2047/d20005096

Chicago Manual of Style (16th Edition):

Arya, Kapil. “User-space process virtualization in the context of checkpoint-restart and virtual machines.” 2015. Doctoral Dissertation, Northeastern University. Accessed April 14, 2021. http://hdl.handle.net/2047/d20005096.

MLA Handbook (7th Edition):

Arya, Kapil. “User-space process virtualization in the context of checkpoint-restart and virtual machines.” 2015. Web. 14 Apr 2021.

Vancouver:

Arya K. User-space process virtualization in the context of checkpoint-restart and virtual machines. [Internet] [Doctoral dissertation]. Northeastern University; 2015. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/2047/d20005096.

Council of Science Editors:

Arya K. User-space process virtualization in the context of checkpoint-restart and virtual machines. [Doctoral Dissertation]. Northeastern University; 2015. Available from: http://hdl.handle.net/2047/d20005096

13. Abeyratne, Sandunmalee. Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing.

Degree: PhD, Computer Science & Engineering, 2017, University of Michigan

Today’s supercomputers are built from the state-of-the-art components to extract as much performance as possible to solve the most computationally intensive problems in the world.…

Subjects/Keywords: exascale supercomputer architecture; kilo-core on-chip interconnect topology; checkpoint/restart fault tolerance; Computer Science; Engineering

…Fault Tolerance in High Performance Computing… Checkpoint/Restart… …compute node’s storage. Checkpoint/restart is a key ingredient in attaining resilience, but it… …information of fault tolerance, checkpoint/restart, non-volatile memories and flash. Chapter III… …background into fault tolerance, checkpoint/restart, and flash memory. 2.1 Fault Tolerance in… …be higher. 2.2 Checkpoint/Restart The most common approach to fault tolerance in high…


APA (6th Edition):

Abeyratne, S. (2017). Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing. (Doctoral Dissertation). University of Michigan. Retrieved from http://hdl.handle.net/2027.42/137096

Chicago Manual of Style (16th Edition):

Abeyratne, Sandunmalee. “Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing.” 2017. Doctoral Dissertation, University of Michigan. Accessed April 14, 2021. http://hdl.handle.net/2027.42/137096.

MLA Handbook (7th Edition):

Abeyratne, Sandunmalee. “Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing.” 2017. Web. 14 Apr 2021.

Vancouver:

Abeyratne S. Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing. [Internet] [Doctoral dissertation]. University of Michigan; 2017. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/2027.42/137096.

Council of Science Editors:

Abeyratne S. Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing. [Doctoral Dissertation]. University of Michigan; 2017. Available from: http://hdl.handle.net/2027.42/137096

14. Bentria, Dounia. Combining checkpointing and other resilience mechanisms for exascale systems : L'utilisation conjointe de mécanismes de sauvegarde de points de reprise (checkpoints) et d'autres mécanismes de résilience pour les systèmes exascales.

Degree: Docteur es, Informatique, 2014, Lyon, École normale supérieure

In this thesis, we studied scheduling and optimization problems in probabilistic settings. The contributions of this thesis fall into two…

Subjects/Keywords: Fault tolerance; Exascale; Optimization; Scheduling; Checkpoint/restart; Replication; Fault prediction; Silent errors; Query processing; Boolean operators; Energy; Greedy algorithm; Data sharing; Probabilistic algorithms


APA (6th Edition):

Bentria, D. (2014). Combining checkpointing and other resilience mechanisms for exascale systems : L'utilisation conjointe de mécanismes de sauvegarde de points de reprise (checkpoints) et d'autres mécanismes de résilience pour les systèmes exascales. (Doctoral Dissertation). Lyon, École normale supérieure. Retrieved from http://www.theses.fr/2014ENSL0971

Chicago Manual of Style (16th Edition):

Bentria, Dounia. “Combining checkpointing and other resilience mechanisms for exascale systems : L'utilisation conjointe de mécanismes de sauvegarde de points de reprise (checkpoints) et d'autres mécanismes de résilience pour les systèmes exascales.” 2014. Doctoral Dissertation, Lyon, École normale supérieure. Accessed April 14, 2021. http://www.theses.fr/2014ENSL0971.

MLA Handbook (7th Edition):

Bentria, Dounia. “Combining checkpointing and other resilience mechanisms for exascale systems : L'utilisation conjointe de mécanismes de sauvegarde de points de reprise (checkpoints) et d'autres mécanismes de résilience pour les systèmes exascales.” 2014. Web. 14 Apr 2021.

Vancouver:

Bentria D. Combining checkpointing and other resilience mechanisms for exascale systems : L'utilisation conjointe de mécanismes de sauvegarde de points de reprise (checkpoints) et d'autres mécanismes de résilience pour les systèmes exascales. [Internet] [Doctoral dissertation]. Lyon, École normale supérieure; 2014. [cited 2021 Apr 14]. Available from: http://www.theses.fr/2014ENSL0971.

Council of Science Editors:

Bentria D. Combining checkpointing and other resilience mechanisms for exascale systems : L'utilisation conjointe de mécanismes de sauvegarde de points de reprise (checkpoints) et d'autres mécanismes de résilience pour les systèmes exascales. [Doctoral Dissertation]. Lyon, École normale supérieure; 2014. Available from: http://www.theses.fr/2014ENSL0971

15. Calhoun, Jon Cameron. From detection to optimization: impact of soft errors on high-performance computing applications.

Degree: PhD, Computer Science, 2017, University of Illinois – Urbana-Champaign

As high-performance computing (HPC) continues to progress, constraints on HPC system design force the handling of errors to higher levels in the software stack. Of…

Subjects/Keywords: High-performance computing; Fault tolerance; Silent data corruption; Soft errors; Error detection; Error recovery; Fault injection; Error propagation; Lossy compression; Checkpoint-restart

checkpoint-restart. Checkpoint-restart HPC checkpoint-restart relies on a short detection latency… …System-level checkpoint-restart [58, 87, 45] offers the ability to recover… …globally coordinated checkpoint-restart limits application performance due to coordination and I… …asynchronous checkpoint-restart schemes have been developed [83, 94, 42], but are… …logged communication. Although application-based checkpoint-restart and asynchronous checkpoint…


APA (6th Edition):

Calhoun, J. C. (2017). From detection to optimization: impact of soft errors on high-performance computing applications. (Doctoral Dissertation). University of Illinois – Urbana-Champaign. Retrieved from http://hdl.handle.net/2142/98379

Chicago Manual of Style (16th Edition):

Calhoun, Jon Cameron. “From detection to optimization: impact of soft errors on high-performance computing applications.” 2017. Doctoral Dissertation, University of Illinois – Urbana-Champaign. Accessed April 14, 2021. http://hdl.handle.net/2142/98379.

MLA Handbook (7th Edition):

Calhoun, Jon Cameron. “From detection to optimization: impact of soft errors on high-performance computing applications.” 2017. Web. 14 Apr 2021.

Vancouver:

Calhoun JC. From detection to optimization: impact of soft errors on high-performance computing applications. [Internet] [Doctoral dissertation]. University of Illinois – Urbana-Champaign; 2017. [cited 2021 Apr 14]. Available from: http://hdl.handle.net/2142/98379.

Council of Science Editors:

Calhoun JC. From detection to optimization: impact of soft errors on high-performance computing applications. [Doctoral Dissertation]. University of Illinois – Urbana-Champaign; 2017. Available from: http://hdl.handle.net/2142/98379
